TOPICS IN 
NUMBER THEORY 


VOLUME II 




This book is in the

ADDISON-WESLEY SERIES IN MATHEMATICS
Eric Reissner, Consulting Editor






WILLIAM JUDSON LeVEQUE 

Department of Mathematics 
University of Michigan 




ADDISON-WESLEY PUBLISHING COMPANY, INC.

READING, MASSACHUSETTS, U.S.A.

LONDON, ENGLAND


PREFACE 


This book is a treatment of some advanced topics in the theory of
numbers. It was written to follow the author's “Topics in Number
Theory, Volume I,” in which elementary number theory is presented.

The level of mathematical maturity required for Volume II is
much higher than for Volume I. Moreover, results obtained in
Volume I are used freely, and in several of the chapters a knowledge of
specific topics in various other branches of mathematics is assumed.
In particular, knowledge of the theory of symmetric polynomials, as
well as the rule for multiplying determinants, is needed for the
algebraic theory in Chapter 3, and the theory of analytic functions is
used both in the theorem of Schneider in Chapter 5 and in the
investigation of the distribution of primes in Chapter 7. There seemed
to be no point in assuming background unnecessarily, however, so I
have included brief discussions of groups and matrices, on a very
elementary level, in Chapter 1.

The treatment of quadratic forms, admittedly shallow, has been
based on the properties of the modular group for two reasons. In the
first place, the geometric interpretation makes the usual definition of
reduced forms seem quite natural, while no real insight is afforded by
merely listing an unmotivated set of inequalities. In the second
place, this treatment provides a simple illustration of the power of the
theory associated with elliptic functions, which is of considerable
importance in modern number theory. Such methods are not often
taught in American universities, and I hope that this treatment may
serve to stimulate interest in them.

To the best of my knowledge, the algebraic form of the
Thue-Siegel-Roth theorem given in Chapter 4 has not previously appeared
in print.


Ann Arbor, Michigan 
November 1955 






CONTENTS


CHAPTER 1  Binary Quadratic Forms

1-1  Introduction
1-2  Groups
1-3  The modular group
1-4  Reduced definite forms
1-5  Reduction of definite forms
1-6  Representations by definite forms
1-7  Indefinite forms
1-8  The automorphs of indefinite forms
1-9  Reduction of indefinite forms
1-10 Representations

CHAPTER 2  Algebraic Numbers

2-1  Introduction
2-2  Polynomials and algebraic numbers
2-3  Algebraic integers
2-4  Units and primes in $R(\vartheta)$
2-5  Ideals
2-6  The arithmetic of ideals
2-7  Congruences. The norm of an ideal
2-8  Prime ideals
2-9  Units of algebraic number fields

CHAPTER 3  Applications to Rational Number Theory

3-1  Introduction
3-2  Equivalence and class number
3-3  The cyclotomic field $K_p$
3-4  Fermat's equation
3-5  Kummer's theorem
3-6  The equation $x^2 + 2 = y^3$
3-7  Pure cubic fields
3-8  Two lemmas
3-9  The Delaunay-Nagell theorem

CHAPTER 4  The Thue-Siegel-Roth Theorem

4-1  Introduction
4-2  Polynomials
4-3  Generalized Wronskians
4-4  The index
4-5  A combinatorial lemma
4-6  The approximation polynomial
4-7  The Thue-Siegel-Roth theorem
4-8  Applications to Diophantine equations
4-9  A special equation

CHAPTER 5  Irrationality and Transcendence

5-1  Irrational numbers
5-2  The existence of transcendental numbers
5-3  A criterion for transcendence
5-4  Measure of transcendence. Mahler's classification
5-5  Arithmetic properties of the exponential function
5-6  A theorem of Schneider
5-7  The Hilbert-Gelfond-Schneider theorem

CHAPTER 6  Dirichlet's Theorem

6-1  Introduction
6-2  Characters
6-3  The L-functions
6-4  Nonelementary proof of Dirichlet's theorem
6-5  Elementary proof of Dirichlet's theorem
6-6  Proof that $L(1, \chi) \ne 0$

CHAPTER 7  The Prime Number Theorem

7-1  Introduction
7-2  Preliminary results
7-3  The Prime Number Theorem
7-4  Extension to primes in an arithmetic progression
7-5  The integers representable as a sum of two squares

Supplementary Reading

List of Symbols

Index






















CHAPTER 1 


BINARY QUADRATIC FORMS 


1-1 Introduction. One of the subjects treated in elementary
number theory is the possibility of representing a positive integer as
a sum of two squares.* The expression $x^2 + y^2$ which is of interest
for this problem is a special case of the general binary quadratic form

$$f(x, y) = ax^2 + bxy + cy^2. \tag{1}$$

(This in turn is a special case of the $n$-ary $m$-ic form, which is a
homogeneous polynomial of degree m in n variables.) Systematic 
research in quadratic forms was begun by Gauss, and has since been 
extensively pursued. We shall not go very deeply into the subject, 
but prefer instead to develop general methods whose usefulness is not 
limited to the theory of quadratic forms, nor even to the theory of 
numbers. 

Suppose that in (1) we make the linear homogeneous substitution

$$x = \alpha x' + \beta y', \qquad y = \gamma x' + \delta y', \tag{2}$$

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are integers and
$D = \alpha\delta - \beta\gamma \ne 0$. Solving for $x'$ and $y'$ gives

$$x' = \frac{\delta}{D}\,x - \frac{\beta}{D}\,y, \qquad y' = -\frac{\gamma}{D}\,x + \frac{\alpha}{D}\,y, \tag{3}$$

so it is only in case $D = \pm 1$ that to each integer pair $x, y$ corresponds
an integer pair $x', y'$, and conversely. We shall eventually suppose
that $D = +1$, for reasons that will appear later. Then

$$x' = \delta x - \beta y, \qquad y' = -\gamma x + \alpha y. \tag{4}$$

*See, for example, LeVeque, Topics in Number Theory, vol. I (Reading,
Mass.: Addison-Wesley Publishing Company, Inc., 1956), Chapter 7. So
much use will be made of the results obtained in this book that it will be
referred to henceforth simply as Volume I.

Substituting (2) into (1), we have a new form in $x'$ and $y'$,

$$g(x', y') = Ax'^2 + Bx'y' + Cy'^2,$$

where

$$\begin{aligned}
A &= a\alpha^2 + b\alpha\gamma + c\gamma^2, \\
B &= 2a\alpha\beta + b(\alpha\delta + \beta\gamma) + 2c\gamma\delta, \\
C &= a\beta^2 + b\beta\delta + c\delta^2.
\end{aligned} \tag{5}$$

If for suitable integral values of $x$ and $y$ we have $f(x, y) = n$, then, for
the corresponding values of $x'$ and $y'$ determined by equation (4),
$g(x', y') = n$.

It thus appears that, as far as questions of representation are
concerned, it would be senseless duplication to consider $f(x, y)$ and
$g(x', y')$ separately; every integer represented by $f$ is also
represented by $g$, and conversely. This leads us to call $f$ and $g$ equivalent,
and to write $f \sim g$, if one can be obtained from the other by a
unimodular linear substitution with integral coefficients,

$$x = \alpha x' + \beta y', \qquad y = \gamma x' + \delta y', \qquad \alpha\delta - \beta\gamma = 1. \tag{6}$$



This in turn brings up one of the principal problems of this chapter: 
how to decide whether two given forms are equivalent. 
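
The coefficient formulas (5) and the claim that equivalent forms take the same values can be checked mechanically. The following is a minimal sketch, not part of the original text; the function names `substitute` and `evaluate` and the sample form and matrix are illustrative assumptions.

```python
def substitute(form, m):
    """Coefficients (A, B, C) of g, given f = (a, b, c) and
    the substitution matrix m = ((alpha, beta), (gamma, delta))."""
    a, b, c = form
    (alpha, beta), (gamma, delta) = m
    A = a * alpha**2 + b * alpha * gamma + c * gamma**2
    B = 2 * a * alpha * beta + b * (alpha * delta + beta * gamma) + 2 * c * gamma * delta
    C = a * beta**2 + b * beta * delta + c * delta**2
    return (A, B, C)


def evaluate(form, x, y):
    """Value of a*x^2 + b*x*y + c*y^2."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y


f = (1, 0, 1)           # the form x^2 + y^2
m = ((2, 1), (1, 1))    # alpha*delta - beta*gamma = 1
g = substitute(f, m)    # (5, 6, 2)

# Each value g(x', y') equals f(x, y) at the point given by substitution (2).
for xp in range(-3, 4):
    for yp in range(-3, 4):
        x = m[0][0] * xp + m[0][1] * yp
        y = m[1][0] * xp + m[1][1] * yp
        assert evaluate(g, xp, yp) == evaluate(f, x, y)
```

Since the substitution is unimodular, the integer pairs $(x', y')$ and $(x, y)$ correspond one to one, so the two forms represent exactly the same integers.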

The substitution (2) is described quite adequately by specifying
the coefficients $\alpha$, $\beta$, $\gamma$, and $\delta$; that is, by writing the matrix

$$M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}.$$

This symbol does not represent a number, of course; it is simply a
list of the coefficients of the substitution, in the order in which they
occur in (2). However, we can give names to these matrices, and
deduce certain of their properties from the corresponding properties
of the associated substitutions. Thus if

$$M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}, \qquad M' = \begin{pmatrix} \alpha' & \beta' \\ \gamma' & \delta' \end{pmatrix},$$

then we shall say that $M$ and $M'$ are equal if and only if they
correspond to the same substitution, that is,

$$\alpha = \alpha', \quad \beta = \beta', \quad \gamma = \gamma', \quad \delta = \delta'.$$


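If for arbitrary $M$ and $M'$ we apply the corresponding substitutions
successively, so that

$$\begin{aligned}
x &= \alpha x' + \beta y', & x' &= \alpha' x'' + \beta' y'', \\
y &= \gamma x' + \delta y', & y' &= \gamma' x'' + \delta' y'',
\end{aligned}$$

we could accomplish the same thing by the single substitution

$$\begin{aligned}
x &= (\alpha\alpha' + \beta\gamma')x'' + (\alpha\beta' + \beta\delta')y'', \\
y &= (\gamma\alpha' + \delta\gamma')x'' + (\gamma\beta' + \delta\delta')y''.
\end{aligned}$$

Thus, if by the product $MM'$ of two matrices we mean the matrix of
this latter substitution, we must define

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}
\begin{pmatrix} \alpha' & \beta' \\ \gamma' & \delta' \end{pmatrix}
= \begin{pmatrix} \alpha\alpha' + \beta\gamma' & \alpha\beta' + \beta\delta' \\ \gamma\alpha' + \delta\gamma' & \gamma\beta' + \delta\delta' \end{pmatrix}.$$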

Thus the product has as element in the $i$th row and $j$th column, for
each $i$ and $j$, the sum of the products of the elements of the $i$th row of
the first matrix with the corresponding elements of the $j$th column of
the second matrix. Moreover, if the determinant of a matrix is defined
as

$$\det M = \begin{vmatrix} \alpha & \beta \\ \gamma & \delta \end{vmatrix} = \alpha\delta - \beta\gamma,$$

it requires only a routine calculation to show that

$$\det M \cdot \det M' = \det (MM').$$

It is to be noticed that, in general, $MM' \ne M'M$, although

$$M(M'M'') = (MM')M''.$$

Since the substitutions given by (2) and (3) are inverse to each
other, it is natural to call the matrix of (3) the inverse of the matrix $M$
of (2), and to designate it by $M^{-1}$. Then $MM^{-1} = M^{-1}M = I$,
where

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{7}$$

$I$ is called the identity matrix; it corresponds to the trivial
substitution $x = x'$, $y = y'$, and has the property that $MI = IM = M$ for
every $M$. A square matrix has an inverse if and only if its determinant
is different from zero.
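
The product rule, the multiplicativity of the determinant, and the inverse of a unimodular matrix can be tried out on small examples. This is a sketch under the assumption that a two-by-two matrix is stored as a pair of row tuples; the helper names are hypothetical and not from the text.

```python
def mat_mul(m, n):
    """Row-by-column product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def det(m):
    (a, b), (c, d) = m
    return a * d - b * c


def inverse_unimodular(m):
    """Inverse of an integer matrix of determinant 1, read off from equations (4)."""
    (a, b), (c, d) = m
    if det(m) != 1:
        raise ValueError("matrix must have determinant 1")
    return ((d, -b), (-c, a))


M = ((2, 3), (1, 2))
N = ((1, 1), (0, 1))
I = ((1, 0), (0, 1))

assert det(mat_mul(M, N)) == det(M) * det(N)      # determinants multiply
assert mat_mul(M, N) != mat_mul(N, M)             # multiplication is not commutative
assert mat_mul(M, inverse_unimodular(M)) == I     # M M^{-1} = I
assert mat_mul(inverse_unimodular(M), M) == I     # M^{-1} M = I
```

For a general nonzero determinant $D$ the entries of the inverse are the coefficients in (3) and need not be integers, which is why the sketch restricts itself to $D = 1$.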

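Finally, we designate by $\tilde{M}$ the transpose of $M$, obtained by
interchanging rows and columns in $M$:

$$\text{if } M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}, \text{ then } \tilde{M} = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix}.$$

The transpose of a product is the product of the transposes, in reverse
order:

$$\widetilde{MM'} = \tilde{M}'\tilde{M}.$$

Also, the transpose of the inverse is equal to the inverse of the
transpose:

$$(\tilde{M})^{-1} = \widetilde{M^{-1}}.$$

Matrices need not be square. Thus

$$\text{if } X = (x \;\; y), \text{ then } \tilde{X} = \begin{pmatrix} x \\ y \end{pmatrix};$$

note, however, that nonsquare matrices have neither determinants
nor inverses.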

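The importance of this algebra of matrices to our present purpose
lies in the fact that if

$$F = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix},$$

then

$$X F \tilde{X} = (x \;\; y) \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
= \left(ax + \tfrac{1}{2}by \;\;\; \tfrac{1}{2}bx + cy\right) \begin{pmatrix} x \\ y \end{pmatrix}
= (ax^2 + bxy + cy^2).$$

Although it is a slight abuse of language, it is convenient and in the
present context harmless to identify a one-by-one matrix with the
element itself, so we write

$$f(x, y) = X F \tilde{X}.$$

$F$ is called the matrix of the form, and $\Delta = 4 \cdot \det F = 4ac - b^2$ is
called the discriminant of the form.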

In terms of matrices, the substitution equations (2) and (3) can
be written as

$$X = X'\tilde{M} \quad\text{and}\quad X' = X\tilde{M}^{-1},$$

respectively. Thus

$$f(x, y) = X F \tilde{X} = (X'\tilde{M})\,F\,\widetilde{(X'\tilde{M})} = (X'\tilde{M})\,F\,(M\tilde{X}') = X'(\tilde{M}FM)\tilde{X}',$$

so that the matrix of $g$ is $G = \tilde{M}FM$. (The reader might test his
ability to manipulate matrices by showing that the last equation is in
agreement with equations (5).) Multiplying both sides of the
equation $G = \tilde{M}FM$ by $\tilde{M}^{-1}$ on the left and by $M^{-1}$ on the right, we have

$$\tilde{M}^{-1} G M^{-1} = (\tilde{M}^{-1}\tilde{M})\,F\,(MM^{-1}) = F.$$

If $\det M = 1$, then also $\det \tilde{M} = 1$, and

$$\det G = \det (\tilde{M}FM) = \det \tilde{M} \cdot \det F \cdot \det M = \det F,$$

so that the discriminant of a form is not changed by a unimodular
substitution.

In summary, a form with matrix $F$ is equivalent to a form with
matrix $G$ if and only if there is a matrix $M$ such that $G = \tilde{M}FM$ and
$\det M = +1$; equivalent forms have the same discriminant and
represent the same integers.
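
The matrix form of the substitution can be checked with exact rational arithmetic. The sketch below is illustrative only: it assumes a form is stored as a triple `(a, b, c)`, uses `fractions.Fraction` for the off-diagonal entries $b/2$, and the helper names are hypothetical.

```python
from fractions import Fraction


def form_matrix(a, b, c):
    """Matrix F of the form a*x^2 + b*x*y + c*y^2, with b/2 off the diagonal."""
    half_b = Fraction(b, 2)
    return ((Fraction(a), half_b), (half_b, Fraction(c)))


def transpose(m):
    (a, b), (c, d) = m
    return ((a, c), (b, d))


def mat_mul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def transformed_form(form, m):
    """Coefficients of the form whose matrix is G = (transpose of M) * F * M."""
    G = mat_mul(mat_mul(transpose(m), form_matrix(*form)), m)
    return (G[0][0], 2 * G[0][1], G[1][1])


def discriminant(form):
    a, b, c = form
    return 4 * a * c - b * b


f = (3, -7, 4)
M = ((2, 3), (1, 2))          # determinant 1
g = transformed_form(f, M)    # equals (2, 3, 1)
assert discriminant(g) == discriminant(f)
```

The same triple results from the coefficient formulas (5), which is the agreement the text asks the reader to verify.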

The relation of “equivalence,” as used here, is an equivalence
relation in the technical sense.* For it is clear that

(a) $f \sim f$: $F = \tilde{I}FI$;

(b) $f \sim g$ implies $g \sim f$: $G = \tilde{M}FM$ implies $F = \tilde{M}^{-1}GM^{-1}$;

(c) $f \sim g$ and $g \sim h$ implies $f \sim h$: $G = \tilde{M}FM$ and $H = \tilde{M}'GM'$
implies $H = \tilde{M}'\tilde{M}FMM' = \widetilde{(MM')}\,F\,(MM')$.

Thus all the forms equivalent to a given one are equivalent to each
other, and the set of all forms splits up into equivalence classes, any
two elements in one class being equivalent, and elements from
different classes being inequivalent. (The equivalence classes for the
relation of congruence (mod $m$) are simply the residue classes modulo $m$.)
Just as we chose a system of representatives of the various residue
classes modulo $m$, we would like to pick a system of representative
forms, one from each class. It is the object of the next two sections to
develop machinery by which such reduced forms can be obtained in a
natural way.

*See, for example, Volume I, Section 3-1.

PROBLEMS

1. Give proofs of the following statements, for the case of two-by-two
matrices:

(a) for some $M$ and $M'$, $MM' \ne M'M$,

(b) $M(M'M'') = (MM')M''$,

(c) $MM^{-1} = M^{-1}M = I$, $IM = MI = M$,

(d) $\widetilde{MM'} = \tilde{M}'\tilde{M}$.

2. Verify directly, and also by matrix multiplication, that under the
substitution

$$x = 2x' + 3y', \qquad y = x' + 2y',$$

the form $F(x, y) = 3x^2 - 7xy + 4y^2$ goes into $G(x', y') = 2x'^2 + 3x'y' + y'^2$.
Compute the inverse of the matrix of the substitution, and so carry $G$ back
into $F$.

1-2 Groups. We say that a set $G$ of elements $a, b, \ldots$, which
need not be numbers, forms a group with respect to a certain
operation (designated for the moment by the symbol “$\circ$”), which combines
two elements to form a third, if

(a) for every $a$ and $b$ in $G$, $a \circ b$ is in $G$,

(b) the operation is associative, so that $a \circ (b \circ c) = (a \circ b) \circ c$, for
every $a$, $b$, and $c$ in $G$,

(c) there is an identity element $e$, such that $a \circ e = e \circ a = a$ for
every $a$ in $G$,

(d) every $a$ in $G$ has an inverse $a^{-1}$ in $G$, such that

$$a \circ a^{-1} = a^{-1} \circ a = e.$$

Perhaps the simplest example of a group is the set of all integers,
under the operation addition. In this case, the number 0 is the
identity, since $a + 0 = 0 + a = a$, and the inverse of $a$ is $-a$, since
$a + (-a) = (-a) + a = 0$. This group is infinite (i.e., has
infinitely many elements), but the group consisting of the four
numbers $1$, $i$, $-1$, and $-i$, under the operation multiplication, is
finite. If the operation consists of adding two integers and reducing
the result to the least positive residue (mod $m$), then the numbers
$1, 2, \ldots, m$ form a group, with $m$ as the identity. Instead of a
complete residue system we could consider a reduced residue system
(mod $m$); these numbers form a group $M(m)$ in which the operation
is ordinary multiplication followed by reduction (mod $m$). This is
not quite so obvious; the identity is clearly the number $e \equiv 1 \pmod{m}$,
but the existence of inverses depends on the fact that the congruence
$ax \equiv 1 \pmod{m}$ is solvable if $(a, m) = 1$. Many of the results in the
congruence theory which are obtained in beginning texts are simply
special cases of general theorems about finite groups; it might be of
interest to examine this relationship briefly before proceeding.

A subset $G_1$ of the elements of a group $G$ is said to form a subgroup
of $G$ if it also forms a group with respect to the operation of $G$.
Condition (b) is automatically satisfied in this case, so that one need only
verify that (a) holds (which we express by saying that $G_1$ is closed
under the operation), that the identity is in $G_1$, and that the inverse
of each element of $G_1$ is in $G_1$.

The number of elements in a finite group is called the order of the
group; we now show that the order of a subgroup $G_1$ divides the order
of the group $G$. Suppose that $a$ is an element of $G$ which is not in $G_1$,
and let $aG_1$ be the set of all “products” $a \circ g$, where $g$ runs through $G_1$.
Then no element $a \circ g$ is in $G_1$, for if it were, the same would be true of
$a \circ g \circ g^{-1} = a$. Also, if $g_1$ and $g_2$ are distinct elements of $G_1$, then
$a \circ g_1 \ne a \circ g_2$, since otherwise $a^{-1} \circ a \circ g_1 = a^{-1} \circ a \circ g_2$, or $g_1 = g_2$.
Now suppose that $b$ in $G$ is not in either $G_1$ or $aG_1$; then as before, no
element of $bG_1$ is in $G_1$ or $aG_1$, and elements of $bG_1$ arising from
different elements of $G_1$ are different. This process can be continued
until every element of $G$ is in precisely one of the sets $G_1, aG_1, bG_1, \ldots$,
and each set contains exactly $m$ distinct elements, where $m$ is the order
of $G_1$. Clearly, if there are $t$ such sets, then the order of $G$ is $mt$,
which is divisible by $m$.
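
The decomposition into the sets $G_1, aG_1, bG_1, \ldots$ can be illustrated in the group $M(m)$ of reduced residues. This is a sketch only; the function names and the choice $m = 15$ with the subgroup $\{1, 4, 11, 14\}$ are assumptions made for the example.

```python
from math import gcd


def reduced_residues(m):
    """Elements of M(m): the residues 1..m that are prime to m (m >= 2)."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]


def cosets(m, subgroup):
    """Split M(m) into the sets a*G1, following the argument in the text."""
    seen = set()
    result = []
    for a in reduced_residues(m):
        if a in seen:
            continue                      # a already lies in an earlier set
        coset = sorted({(a * g) % m for g in subgroup})
        result.append(coset)
        seen.update(coset)
    return result


G1 = [1, 4, 11, 14]                       # a subgroup of M(15)
parts = cosets(15, G1)                    # [[1, 4, 11, 14], [2, 7, 8, 13]]

# Every set has as many elements as G1, so the order of G1 divides that of M(15).
assert all(len(p) == len(G1) for p in parts)
assert len(reduced_residues(15)) == len(G1) * len(parts)
```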


If $a$ is any element of $G$, then the powers of $a$ (that is, $a$, $a^2 = a \circ a$,
$a^3 = a \circ a \circ a, \ldots$) are all in $G$; we easily see that if $G$ is finite, these
powers form a subgroup, whose order $m$ is such that $a^m = e$. Hence,
if the order of $G$ is $mt$, then $a^{mt} = e$. For the group $M(m)$ defined
above, whose order is $\varphi(m)$, this statement reduces to Euler's theorem,
which states that $a^{\varphi(m)} \equiv 1 \pmod{m}$ if $(a, m) = 1$. Many of the
other results of Volume I follow immediately from these remarks
about groups. For example, Theorem 3-19, which says that if $a$
belongs to $t$ (mod $m$) then $t \mid \varphi(m)$, can be reworded in the language of
groups to read, “The order of the cyclic subgroup generated by $a$
(i.e., the group of powers of $a$) divides the order of the group.” A
primitive root of $q$ is a generator of $M(q)$; thus Theorem 4-11 is a
statement of the fact that $M(q)$ is cyclic (consists of the powers of
a single element) if and only if $q = 1, 2, 4, p^\alpha$, or $2p^\alpha$, where $p$ is an
odd prime. If $\alpha > 2$, $M(2^\alpha)$
has two generators, $-1$ and $5$; for example, modulo 16, the powers of
5 are 5, 9, 13, and 1, and these numbers, together with their negatives,
form a reduced residue system.
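
These statements are easy to test numerically. The following sketch, with hypothetical helper names `phi` and `order`, checks Euler's theorem, the divisibility of $\varphi(m)$ by the order of an element, and the example modulo 16.

```python
from math import gcd


def phi(m):
    """Euler's function: the order of the group M(m)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def order(a, m):
    """Smallest t >= 1 with a^t = 1 (mod m); requires (a, m) = 1."""
    if gcd(a, m) != 1:
        raise ValueError("a must be prime to m")
    t, power = 1, a % m
    while power != 1 % m:
        power = power * a % m
        t += 1
    return t


m = 16
for a in range(1, m):
    if gcd(a, m) == 1:
        assert pow(a, phi(m), m) == 1        # Euler's theorem
        assert phi(m) % order(a, m) == 0     # the order of a divides phi(m)

powers_of_5 = [pow(5, k, 16) for k in range(1, 5)]    # [5, 9, 13, 1]
with_negatives = sorted(set(powers_of_5) | {(-p) % 16 for p in powers_of_5})
assert with_negatives == [a for a in range(1, 16) if gcd(a, 16) == 1]
```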

At present we shall do no more with finite groups, but turn our
attention instead to the much more complicated multiplicative group
of all two-by-two matrices with integral entries and unit determinants.
This infinite group, which will be designated here by $\Gamma$, is called the
modular group. To show that $\Gamma$ is a group, we verify properties (a)
through (d) above. The system is obviously closed under
multiplication, since the determinant of a product is the product of the
determinants of the factors. The associative property has already been
verified. The identity element of $\Gamma$ is $I$, as defined in (7). The
inverse of any element

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

is

$$\begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix},$$

since

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix} = I.$$

The group $\Gamma$ differs from the other examples mentioned in that it is
noncommutative, since in general $MM' \ne M'M$. (Abstractly, $G$ is
said to be a commutative or abelian group if $a \circ b = b \circ a$ for every $a$
and $b$ in $G$.)

1-3 The modular group. The properties of $\Gamma$ could all be
developed by the use of algebra alone; we prefer instead to build up the
theory with the help of a simple geometric interpretation. It is now
convenient to reverse the roles of the accented and unaccented
variables in the equations (2); this new notation will be used
throughout the discussion of the modular group, but the original system will
be reverted to when quadratic forms are again considered. To keep
matters straight, (2) will be termed a substitution, while the modified
equations will be called a transformation. Putting $z = x/y$ and
$z' = x'/y'$, we get

$$z' = \frac{\alpha z + \beta}{\gamma z + \delta}. \tag{8}$$

So far nothing essential has been accomplished. The crucial point
lies in allowing $z$ to range over all complex numbers, rather than the
real rationals to which it was formerly restricted. Then equation (8)
can be regarded as defining a transformation or mapping of the
complex $z$-plane into the $z'$-plane. Somewhat more than this can be said:
if

$$M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

is in $\Gamma$, so that $\det M = 1$, a simple calculation shows that the
imaginary parts of $z$ and $z'$ have the same sign. In other words, (8) maps
the upper half of the $z$-plane (i.e., the region where the imaginary part
of $z$ is positive) into the upper half of the $z'$-plane, and the lower half
into the lower half. Hereafter, we restrict attention to the upper half
planes.
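
The "simple calculation" can be written out as follows. This is a worked sketch rather than a quotation from the text; it uses only the facts that the coefficients are real and that $\alpha\delta - \beta\gamma = 1$.

```latex
\[
z' = \frac{\alpha z + \beta}{\gamma z + \delta}
   = \frac{(\alpha z + \beta)(\gamma \bar z + \delta)}{|\gamma z + \delta|^2}
   = \frac{\alpha\gamma |z|^2 + \alpha\delta\, z + \beta\gamma\, \bar z + \beta\delta}{|\gamma z + \delta|^2},
\]
\[
\operatorname{Im} z'
   = \frac{(\alpha\delta - \beta\gamma)\operatorname{Im} z}{|\gamma z + \delta|^2}
   = \frac{\operatorname{Im} z}{|\gamma z + \delta|^2}.
\]
```

Since $\gamma z + \delta \ne 0$ for nonreal $z$, the denominator is positive, so $\operatorname{Im} z'$ and $\operatorname{Im} z$ have the same sign.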

It is convenient to identify the $z$- and $z'$-planes, and to think of (8)
as sending each point $z$ of the upper half $U$ of the complex plane into
another point $z'$ of $U$. We also identify the elements of $\Gamma$ with the
corresponding transformations (8), which has the effect of identifying
the matrices

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} -\alpha & -\beta \\ -\gamma & -\delta \end{pmatrix}.$$

In accordance with the earlier definition of equivalence, two points
$z$ and $z'$ of $U$ will be called equivalent if one can be mapped into the
other by a transformation of $\Gamma$. As usual, this assigns each point of $U$
to an equivalence class; two elements of the same class are equivalent,
and elements from different classes are inequivalent. A region $R$ of $U$
is called a fundamental region if no two of its points are equivalent
while every point of $U$ is equivalent to a point of $R$; in other words,
$R$ constitutes a complete system of representatives of the above
equivalence classes. It would be more precise to refer to $R$ as a
fundamental region of the group $\Gamma$, since two points may be equivalent with
respect to one group of transformations but not with respect to
another. For example, it is clear that a fundamental region $R'$ of a
subgroup $\Gamma'$ of $\Gamma$ contains a fundamental region of $\Gamma$ itself, if both
regions exist. For any point in $U$, being equivalent to some point of
$R'$ under the transformations of $\Gamma'$, is a fortiori equivalent to the same
point of $R'$ under the transformations of the larger group $\Gamma$. It may
not be true, however, that any two points of $R'$ are inequivalent with
respect to $\Gamma$.

Theorem 1-1. The region $R$ in $U$ composed of all points $z$ such that
$-\tfrac{1}{2} \le \operatorname{Re} z < \tfrac{1}{2}$ and either $|z| > 1$, or else $|z| = 1$ and
$-\tfrac{1}{2} \le \operatorname{Re} z \le 0$, is a fundamental region of $\Gamma$. (See Fig. 1-1.)

Proof: First note that $\Gamma$ has the subgroup $\Gamma_0$ of all integral
translations $z' = z + \beta$. For the associated matrix

$$\begin{pmatrix} 1 & \beta \\ 0 & 1 \end{pmatrix}$$

has determinant 1, the identity transformation $z' = z$ is in $\Gamma_0$,
the inverse of $z' = z + \beta$ is $z' = z - \beta$ and is in $\Gamma_0$, and the result
of making two translations is again a translation. $\Gamma_0$ is cyclic,
being generated by

$$z' = z + 1. \tag{9}$$

As a fundamental region of $\Gamma_0$ we could choose any infinite strip in $U$
of unit width, extending parallel to the imaginary axis from the real
axis. We take the following one:

$$R_0:\quad \operatorname{Im} z > 0, \quad -\tfrac{1}{2} \le \operatorname{Re} z < \tfrac{1}{2}.$$

From the remark preceding the theorem, $R_0$ must contain a
fundamental region of $\Gamma$ if any exists. $R_0$ is not itself a fundamental region
of $\Gamma$, however, for the point $i/2$ of $R_0$ is transformed into the point $2i$
of $R_0$ by the transformation

$$z' = -\frac{1}{z}. \tag{10}$$

[Figure 1-1]


With each transformation

$$T:\quad z' = \frac{\alpha z + \beta}{\gamma z + \delta}$$

with $\gamma \ne 0$, there is associated the circle $C(T)$: $|\gamma z + \delta| = 1$, with
center at $-\delta/\gamma$ and radius $1/|\gamma| \le 1$. Now

$$\gamma z' - \alpha = \gamma\,\frac{\alpha z + \beta}{\gamma z + \delta} - \alpha = \frac{\beta\gamma - \alpha\delta}{\gamma z + \delta} = -\frac{1}{\gamma z + \delta},$$

so that $C(T)$ is transformed by $T$ into $|\gamma z' - \alpha| = 1$, which, by (3), is
$C(T^{-1})$. More importantly, the exterior of $C(T)$ goes into the
interior of $C(T^{-1})$. It is simple to deduce from this that no two points
of the region $R$ described in the theorem are equivalent. Certainly no
point of $R$ is mapped into another by an element of $\Gamma_0$. But if $T$ is
not in $\Gamma_0$, then $\gamma \ne 0$, and since the interior of $R$ is external to all the
circles $C(T)$ (inasmuch as they all have radii $\le 1$ and are centered at
real points), any interior point of $R$ is mapped by $T$ into an interior
point of one of these circles, and hence into a point outside $R$.

The arc $A$: $|z| = 1$, $-\tfrac{1}{2} \le \operatorname{Re} z \le 0$, which forms part of the
boundary of $R$, is also completely exterior to all the circles $C(T)$
except $|z| = 1$ and $|z + 1| = 1$. The circle $|z| = 1$ is associated with
transformations

$$z' = \frac{\alpha z + \beta}{z},$$

and since the determinant must be 1, $\beta = -1$ and

$$z' = \frac{\alpha z - 1}{z} = \alpha - \frac{1}{z}, \qquad |z' - \alpha| = \frac{1}{|z|}.$$

If $z$ is a point of $A$, $|z' - \alpha| = 1$, and so $z'$ is not in $R$ unless $\alpha = 0$ or
$-1$. If $\alpha = 0$ we have the transformation (10), which sends $A$ onto
the arc $|z| = 1$, $0 \le \operatorname{Re} z \le \tfrac{1}{2}$; this arc has only the point $i$ in common
with $R$, and $i$ goes into itself. (This means that $i$ is equivalent to
itself in two different ways: $z' = z$ and $z' = -1/z$.) If $\alpha = -1$, $A$
goes into the arc $|z + 1| = 1$, $-1 \le \operatorname{Re} z \le -\tfrac{1}{2}$; these two arcs have
just $\rho = -\tfrac{1}{2} + i\sqrt{3}/2$ in common, and $\rho$ goes into itself.

The circle $|z + 1| = 1$ is associated with transformations

$$z' = \frac{\alpha z + \beta}{z + 1} = \frac{\alpha z + (\alpha - 1)}{z + 1} = \alpha - \frac{1}{z + 1}.$$

If $|z| = 1$, then

$$|z' - (\alpha - 1)| = \left|\frac{z}{z + 1}\right| = \left|\frac{1}{z + 1}\right| = |z' - \alpha|, \qquad \operatorname{Re} z' = \alpha - \tfrac{1}{2},$$

and $z'$ is not in $R$ unless $\alpha = 0$. Under the transformation

$$z' = -\frac{1}{z + 1}$$

the arc $A$ goes into the line segment $\operatorname{Re} z = -\tfrac{1}{2}$, $\tfrac{1}{2} \le \operatorname{Im} z \le \sqrt{3}/2$;
the arc and the segment have just $\rho$ in common, and $\rho$ goes into itself.
We have thus shown that no two points of $R$ are equivalent, and have
incidentally obtained the following result, which will be useful later.

Theorem 1-2. The point $\rho = (-1 + i\sqrt{3})/2$ is mapped into
itself by the three transformations

$$z' = z, \qquad z' = -\frac{1}{1 + z}, \qquad\text{and}\qquad z' = -\frac{1 + z}{z},$$

and by no others. The point $i$ is mapped into itself by the two
transformations

$$z' = z \qquad\text{and}\qquad z' = -\frac{1}{z},$$

and by no others. Any point of $R$ different from $\rho$ and $i$ is mapped
into itself only by the identity transformation $z' = z$.
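
That the listed transformations do fix $\rho$ and $i$ can be confirmed directly. The following worked check is a sketch that uses only the relations $\rho^2 + \rho + 1 = 0$ and $i^2 = -1$; it does not address the "no others" part of the theorem, which rests on the argument given above.

```latex
\[
\rho^2 + \rho + 1 = 0
\;\Longrightarrow\;
\rho(1 + \rho) = -1
\;\Longrightarrow\;
-\frac{1}{1 + \rho} = \rho,
\]
\[
1 + \rho = -\rho^2
\;\Longrightarrow\;
-\frac{1 + \rho}{\rho} = \frac{\rho^2}{\rho} = \rho,
\qquad
-\frac{1}{i} = \frac{i^2}{i} = i.
\]
```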

To complete the proof of Theorem 1-1, we must show that any
point $z$ in $U$ is the image of a point in $R$ under a transformation of $\Gamma$.
We do this by finding a finite sequence of transformations such that if
they are successively applied to $z$, the final point $z'$ is in $R$. Then the
inverse of the product of these transformations maps $z'$ back into $z$.

Designate by $S$ the generator (9) of $\Gamma_0$, and by $W$ the
transformation (10). Let $z$ be a point of $U$ not in $R$. Then for some integer $n_1$,
which may be positive, negative, or zero, $z_1 = S^{n_1}z = z + n_1$ is in $R_0$,
the fundamental domain of $\Gamma_0$. If $z_1$ is in $R$, we are finished. If
$|z_1| = 1$ but $0 < \operatorname{Re} z_1 < \tfrac{1}{2}$, then $Wz_1$ is in $R$. If $|z_1| < 1$, then
$z_2 = Wz_1$ has modulus greater than 1. In fact, if $z_1 = x_1 + iy_1$,

$$z_2 = -\frac{1}{x_1 + iy_1} = \frac{-x_1 + iy_1}{x_1^2 + y_1^2} = x_2 + iy_2,$$

so that $\operatorname{Im} z_2 > \operatorname{Im} z_1$, and if $y_1 < \tfrac{1}{2}$ then $\operatorname{Im} z_2 > 2 \operatorname{Im} z_1$, since
then $x_1^2 + y_1^2 < \tfrac{1}{2}$. If $z_2$ is in $R$, we are through. If not, there is a
suitable exponent $n_2$ such that $z_3 = S^{n_2}z_2$ is in $R_0$, and $\operatorname{Im} z_3 = \operatorname{Im} z_2$.
If $z_3$ is not in $R$, we can apply $W$ again, and get $z_4 = WS^{n_2}WS^{n_1}z$.
What we must show is that after finitely many steps, this process
leads to a point in $R$.

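As long as $y_k < \tfrac{1}{2}$, we will have $y_{k+1} > 2y_k$, if $z_{k+1} = Wz_k$.
Starting with a positive number (the imaginary part of $z$), a finite number
of doublings will produce a number larger than $\tfrac{1}{2}$. So suppose that
we have obtained a $z_k = x_k + iy_k$ such that

$$-\tfrac{1}{2} \le x_k < \tfrac{1}{2}, \qquad y_k \ge \tfrac{1}{2}, \qquad x_k^2 + y_k^2 < 1. \tag{11}$$

Then

$$z_{k+1} = -\frac{1}{z_k}, \qquad z_{k+2} = -\frac{1}{z_k} + n,$$

where $n$ is so determined that $-\tfrac{1}{2} \le x_{k+2} < \tfrac{1}{2}$. This gives

$$z_{k+2} = \frac{nx_k - 1 + iny_k}{x_k + iy_k},$$

so that

$$|z_{k+2}|^2 = \frac{(nx_k - 1)^2 + n^2y_k^2}{x_k^2 + y_k^2}.$$

If $|n| \ge 2$,

$$|z_{k+2}|^2 \ge \frac{n^2 y_k^2}{x_k^2 + y_k^2} > 1,$$

since $n^2 y_k^2 \ge 1$ and $x_k^2 + y_k^2 < 1$, while if $|n| = 1$, the hypothetical
inequality $|z_{k+2}|^2 < 1$ gives

$$(x_k - n)^2 + y_k^2 < x_k^2 + y_k^2,$$

which says that $z_k$ is farther from the origin than from the point
$z = n$, which is false from the first inequality of (11). Finally, if
$n = 0$, then $|z_{k+2}|^2 = 1/|z_k|^2 > 1$. Hence in all cases, $|z_{k+2}| \ge 1$, and
$-\tfrac{1}{2} \le \operatorname{Re} z_{k+2} < \tfrac{1}{2}$. If $z_{k+2}$ is still not in $R$ (which may happen if
$|z_{k+2}| = 1$) then $Wz_{k+2}$ is in $R$, and the proof is complete.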
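
The procedure in the proof can be carried out exactly for points with rational coordinates. The sketch below is illustrative: it represents $z = x + iy$ by a pair of `Fraction` values, and the names `in_R` and `reduce_point`, as well as the "word" it records, are assumptions introduced here.

```python
from fractions import Fraction
from math import floor

HALF = Fraction(1, 2)


def in_R(x, y):
    """Membership of z = x + iy (y > 0) in the region R of Theorem 1-1."""
    if not (-HALF <= x < HALF):
        return False
    norm = x * x + y * y
    return norm > 1 or (norm == 1 and x <= 0)


def reduce_point(x, y):
    """Apply translations S^n and the inversion W until the point lies in R.

    Returns (x', y', word), where word lists the steps in the order applied.
    """
    x, y = Fraction(x), Fraction(y)
    if y <= 0:
        raise ValueError("point must lie in the upper half plane")
    word = []
    while True:
        n = -floor(x + HALF)              # makes -1/2 <= x + n < 1/2
        if n != 0:
            x += n
            word.append(("S", n))
        if in_R(x, y):
            return x, y, word
        norm = x * x + y * y              # W: z -> -1/z
        x, y = -x / norm, y / norm
        word.append(("W", 1))


# The point i/2 of R_0 goes into 2i under W, as noted in the text.
assert reduce_point(0, HALF) == (0, 2, [("W", 1)])
# (2 + i)/5 goes to -2 + i under W, and then to i under S^2.
assert reduce_point(Fraction(2, 5), Fraction(1, 5)) == (0, 1, [("W", 1), ("S", 2)])
```

Exact rational arithmetic is used so that the boundary cases $|z| = 1$ and $\operatorname{Re} z = \pm\tfrac{1}{2}$ are decided without rounding error.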

Moreover, the proof has shown that $S$ and $W$ are generators of $\Gamma$,
since every transformation of $\Gamma$ can be written in the form

$$S^{n_k}W \cdots WS^{n_2}WS^{n_1}.$$


A geometric representation of the group $\Gamma$ is given in Fig. 1-2.
Here we have considered the region $R$ as the image of itself under the
identity transformation $I$, and have put $R = R(I)$. The congruent
unshaded region to the left of $R$ is then $R(S^{-1})$, in the sense that if a
point $z'$ of it is equivalent to a point $z$ of $R$, then $z' = S^{-1}z$. To put it
differently, $S^{-1}$ maps $R$ onto this region, just as $W$ maps $R$ onto the
unshaded region $R(W)$ immediately below $R$. The semicircular arcs
are portions of the circles $C(T)$; infinitely many of them terminate in
each rational point on the real axis. If the drawing and shading were
completed, any shaded or unshaded region could be taken as a
fundamental region. Each fundamental region or “double triangle” is
bounded by three arcs, with vertex angles of $0$, $\pi/3$, and $\pi/3$. The
heavy arc inside each region indicates the portion of the boundary
which is to be included in the region.


PROBLEMS

1. Find the point in $R$ to which the point

8 + 6i

is equivalent, by the method used in the proof of Theorem 1-1. Do you
see an easier way, for this particular number?

2. If the term “circle” is used in the broad sense to include straight lines,
show that the transformations of $\Gamma$ send circles into circles. Under what
circumstances are the image circles actually lines? What can be said about
such a line if, for every point $z$ on the original circle, $\operatorname{Im} z \ge 0$?

3. Verify that, in the notation of the text, $(SW)^3 = I$.

4. Show that the transformations

$$z' = 1 - z, \qquad z' = \frac{1}{z}$$

generate a group of six elements (the group of anharmonic ratios) of which
a fundamental region is the set of points $z$ such that

$$\operatorname{Im} z > 0, \quad |z| > 1, \quad\text{and}\quad |z - 1| > 1,$$

together with half the boundary of this region, leading from $(1 + i\sqrt{3})/2$
to infinity in one direction. Sketch the analog of Fig. 1-2 for this group.
[Note that the transformations of the group do not carry $U$ into itself.]

1-4 Reduced definite forms. With the help of the facts now
known about the modular group, we can deal with the question of how
to decide whether two given binary quadratic forms are equivalent.
We must consider separately the essentially different cases in which
the discriminant $\Delta = 4ac - b^2$ of the form $ax^2 + bxy + cy^2$ is
positive or negative. (We put aside the degenerate case in which
$\Delta = 0$.) If $\Delta > 0$, the form is called definite, otherwise indefinite. The
definite forms can be further classified as positive or negative, according
as $a > 0$ or $a < 0$. The reason for this terminology is that the
polynomial $az^2 + bz + c$ associated with a definite form has nonreal
zeros, so that the form

$$ax^2 + bxy + cy^2 = y^2\left(a\,\frac{x^2}{y^2} + b\,\frac{x}{y} + c\right)$$

has the same sign as $a$ for every choice of $x$ and $y$ except $x = y = 0$,
while an indefinite form can have values of either sign. We shall first
consider definite forms, restricting our attention to positive forms,
since the treatment of negative forms is almost identical.

Since the matrix of a form is a little cumbersome, we shall use the
symbol $[a, b, c]$ to designate the form $ax^2 + bxy + cy^2$. It is to be
clearly understood that this is simply an abbreviation, and cannot be
combined with like symbols as matrices can.

Let us consider then a positive definite form $f(x, y) = [a, b, c]$, in
which $\Delta > 0$, $a > 0$, and $c > 0$. For the time being, we do not
require that $a$, $b$, and $c$ be integers. Then the quadratic polynomial
$f(z) = az^2 + bz + c$ has zeros

$$\frac{-b \pm \sqrt{-\Delta}}{2a};$$

of these, we single out the one with positive imaginary part and call
it $\omega$. Thus to the form $[a, b, c]$ there corresponds the point $\omega$ in the
upper half plane. Conversely, each point in $U$ corresponds to exactly
one form of discriminant $\Delta$. For if $z_0$ is such a point, and $\bar{z}_0$ is its
complex conjugate, then there is a unique positive number $\kappa$ such that the
quadratic expression $\kappa(z - z_0)(z - \bar{z}_0)$ has discriminant $\Delta$. Hence if
we consider only forms of given discriminant $\Delta$ (which is all that is
required in the equivalence problem, since equivalent forms have the
same discriminant), there is a one-to-one correspondence between
points of $U$ and forms of that discriminant. Moreover, if the points
$\omega_1$ and $\omega_2$ are associated with the forms $f_1$ and $f_2$ of discriminant $\Delta$,
and if a transformation $T$ of $\Gamma$ carries $f_1$ into $f_2$, it carries $\omega_1$ into
$\omega_2$. It therefore makes no difference whether one speaks of the form $f$
or the point $\omega$, as far as the operations of $\Gamma$ are concerned. We call $\omega$
the representative of $f$.

It should now be clear how to decide whether or not two forms are
equivalent. If they do not have the same discriminant, they are not
equivalent. If they have, they are equivalent if and only if their
representatives are equivalent, and this can be decided by
transforming the representatives into the fundamental region $R$, where
they must be identical to be equivalent. This leads us to define a
reduced form as one whose representative is in $R$; reduced forms are
equivalent if and only if they are identical, and each class of equivalent
forms contains exactly one reduced form.


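Since 

$$\omega = \frac{-b + \sqrt{-\Delta}}{2a}, \qquad |\omega|^2 = \frac{b^2 + \Delta}{4a^2} = \frac{c}{a},$$

$\omega$ is in R if and only if $-\tfrac12 \le -b/2a < \tfrac12$ and either c/a > 1, or 
c/a = 1 and $-\tfrac12 \le -b/2a \le 0$. Simplifying, we have that [a, b, c] 
is reduced if and only if either 

$$-a < b \le a < c \qquad \text{or} \qquad 0 \le b \le a = c. \tag{13}$$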





PROBLEM 

Prove the assertion, made in the text, that if $\omega_1$ and $\omega_2$ are the 
representatives of the forms $f_1$ and $f_2$ with discriminant $\Delta$, and if a T in $\Gamma$ carries 
$f_1$ into $f_2$, it carries $\omega_1$ into $\omega_2$. 

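1-5 Reduction of definite forms. A given form can be transformed 
into its equivalent reduced form by exactly the process used 
in the proof that R is a fundamental region of $\Gamma$. That is, by a 
translation $S^{n_1}$, $\omega$ can be changed into $\omega'$, where $-\tfrac12 \le \mathrm{Re}\,\omega' < \tfrac12$. If $\omega'$ 
is not in R, we begin afresh with $W\omega'$, etc. The translation 
$z' = z + n_1$ must be such that 

$$-\frac12 \le -\frac{b}{2a} + n_1 < \frac12,$$

or 

$$b = 2an_1 + r_1,$$

where $-a < r_1 \le a$. The transformation $z' = z + n_1$ has matrix 

$$\begin{pmatrix} 1 & n_1 \\ 0 & 1 \end{pmatrix},$$

but we must now revert to the inverse transformation $z = z' - n_1$ to 
utilize the results of Section 1-1, which were based on the equations 
(2). If we put 

$$M = \begin{pmatrix} 1 & -n_1 \\ 0 & 1 \end{pmatrix},$$

then, as we saw earlier, M carries a form with matrix F into one with 
matrix 

$$G = M'FM$$

(M' denoting the transpose of M), so that in this case, if we let the 
result of the first translation be $F_1 = M'FM$, then 

$$F_1 = \begin{pmatrix} 1 & 0 \\ -n_1 & 1 \end{pmatrix} \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \begin{pmatrix} 1 & -n_1 \\ 0 & 1 \end{pmatrix}.$$

Similarly, if $F_2$ is the result of applying the inversion W to $F_1$ then 

$$F_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} F_1 \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

A simple calculation shows that, if $f_1 = [a, b, c]$, then $f_2 = [c, -b, a]$. 

Thus we have the following algorithm for reducing f = [a, b, c]: find 
$n_1$ and $r_1$ such that 

$$b = 2an_1 + r_1, \qquad -a < r_1 \le a,$$

and compute $f_1 = [a_1, b_1, c_1]$, where 

$$F_1 = \begin{pmatrix} 1 & 0 \\ -n_1 & 1 \end{pmatrix} F \begin{pmatrix} 1 & -n_1 \\ 0 & 1 \end{pmatrix},$$

so that $f_1 = [a,\ b - 2an_1,\ n_1^2 a - bn_1 + c]$. If $f_1$ is not reduced, put 
$f_2 = [c_1, -b_1, a_1] = [a_2, b_2, c_2]$. If $f_2$ is not reduced, repeat the entire 
procedure. For some k, $f_k$ will be reduced. 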
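
The algorithm just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reduce_definite` is hypothetical, the coefficients are taken to be integers (the text allows real coefficients at this point), and the reduced conditions are taken to be $-a < b \le a < c$ or $0 \le b \le a = c$.

```python
def reduce_definite(a, b, c):
    """Reduce an integral positive definite form [a, b, c].

    Assumes a > 0 and discriminant 4ac - b^2 > 0 (the convention of this chapter).
    Returns the equivalent reduced form as a tuple (a, b, c).
    """
    if a <= 0 or 4 * a * c - b * b <= 0:
        raise ValueError("form must be positive definite")
    while True:
        # Translation step: choose n1 with b = 2*a*n1 + r1 and -a < r1 <= a.
        n1 = -((a - b) // (2 * a))  # equals ceil((b - a) / (2a))
        # f1 = [a, b - 2*a*n1, n1^2*a - b*n1 + c]; the right side uses the old b.
        a, b, c = a, b - 2 * a * n1, n1 * n1 * a - b * n1 + c
        # Reduced when -a < b <= a < c, or 0 <= b <= a = c.
        if a < c or (a == c and b >= 0):
            return (a, b, c)
        # Inversion step W: [a, b, c] -> [c, -b, a].
        a, b, c = c, -b, a


print(reduce_definite(10, 34, 31))  # (3, 0, 7)
```

For instance, [10, 34, 31] has discriminant 84; one translation gives [10, -6, 3], the inversion gives [3, 6, 10], and a second translation gives the reduced form [3, 0, 7].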

The discussion thus far has been valid for positive definite forms 
with arbitrary real coefficients. For the remainder of this section and 
the next, we consider only integral forms, that is, those with integral 
coefficients. 

Theorem 1-3. There are only finitely many classes of integral 
definite forms of given discriminant. 

Proof: To each class there belongs just one reduced form [a, b, c] 
satisfying the conditions (13). Since 

$$4a^2 \le 4ac = \Delta + b^2 \le \Delta + a^2,$$

the inequality $0 < a \le \sqrt{\Delta/3}$ holds for each reduced form, so that 
there are only finitely many possible values of a for fixed $\Delta$. Since 
$|b| \le a$, the same is true of b, and for each pair a, b there is at most 
one integer c such that $4ac - b^2 = \Delta$. 

If, for example, $\Delta = 3$, then $0 < a \le 1$, so that a = 1 and hence 
b = 0 or 1; from this it is easily seen that the only integral reduced 
form of discriminant 3 is $x^2 + xy + y^2$. There is also just one class of 
discriminant 4, and its reduced form is $x^2 + y^2$. 
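
The finiteness argument is effective, and the search it describes can be sketched as follows. The function name `reduced_forms` is hypothetical; the code simply runs over $0 < a \le \sqrt{\Delta/3}$ and $-a < b \le a$, keeps the pairs for which $c = (\Delta + b^2)/4a$ is an integer, and then applies the conditions (13) in the form $-a < b \le a < c$ or $0 \le b \le a = c$.

```python
def reduced_forms(delta):
    """List all integral reduced positive definite forms with 4ac - b^2 = delta."""
    forms = []
    a = 1
    while 3 * a * a <= delta:            # 0 < a <= sqrt(delta / 3)
        for b in range(-a + 1, a + 1):   # -a < b <= a
            numerator = delta + b * b    # must equal 4ac
            if numerator % (4 * a) == 0:
                c = numerator // (4 * a)
                # Remaining conditions: a < c, or a = c with b >= 0.
                if c > a or (c == a and b >= 0):
                    forms.append((a, b, c))
        a += 1
    return forms


print(reduced_forms(3))   # [(1, 1, 1)]
print(reduced_forms(4))   # [(1, 0, 1)]
print(reduced_forms(23))  # [(1, 1, 6), (2, -1, 3), (2, 1, 3)]
```

The first two outputs agree with the examples in the text; the third shows a discriminant with more than one class.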

PROBLEMS 

1. Find all reduced integral definite forms of discriminant $\Delta < 20$. 

2. Find the reduced form equivalent to [117, 103, 100]. 

1-6 Representations by definite forms. If a transformation of $\Gamma$ 
leaves a quadratic form unchanged, it is called an automorph of the 
form. Since an automorph also leaves the representative of the form 
unchanged, and is the only kind of transformation which does, the 
following theorem is an easy consequence of Theorem 1-2. 



Theorem 1-4. The only automorphs of $a(x^2 + y^2)$ are 

$$x = \pm x',\ \ y = \pm y' \qquad \text{and} \qquad x = \pm y',\ \ y = \mp x'.$$

The only automorphs of $a(x^2 + xy + y^2)$ are 

$$x = \pm x',\ \ y = \pm y'; \qquad x = \pm x' \pm y',\ \ y = \mp x'; \qquad \text{and} \qquad x = \mp y',\ \ y = \pm x' \pm y'.$$

Any positive reduced form distinct from these two has only the 
automorphs 

$$x = \pm x', \qquad y = \pm y'.$$

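An integer n is said to be properly representable by an integral form 
[a, b, c] of discriminant $\Delta$ if there are relatively prime integers $\alpha$, $\gamma$ 
such that $a\alpha^2 + b\alpha\gamma + c\gamma^2 = n$. For such $\alpha$, $\gamma$, there are $\beta_0$ and $\delta_0$ 
such that $\alpha\delta_0 - \beta_0\gamma = 1$, and, in fact, 

$$\alpha\delta - \beta\gamma = 1$$

if and only if, for some integer t, 

$$\beta = \beta_0 + \alpha t, \qquad \delta = \delta_0 + \gamma t.$$

If we make the substitution 

$$x = \alpha x' + \beta y', \qquad y = \gamma x' + \delta y', \tag{14}$$

then [a, b, c] goes into a form [n, m, l] with first coefficient n, by 
equations (5). Also by (5), 

$$m = 2a\alpha(\beta_0 + \alpha t) + b(\alpha\delta_0 + \alpha\gamma t + \beta_0\gamma + \alpha\gamma t) + 2c\gamma(\delta_0 + \gamma t)$$
$$\phantom{m} = 2a\alpha\beta_0 + b(\alpha\delta_0 + \beta_0\gamma) + 2c\gamma\delta_0 + 2nt,$$

so that m is determined modulo 2n. Choose m so that $0 \le m < 2n$; 
then t is fixed, $\beta$ and $\delta$ are unique, and l is determined by the 
discriminant: 

$$4ln - m^2 = \Delta.$$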
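
The construction of $\beta$, $\delta$, m, and l from a proper representation can be followed step by step in code. The sketch below is illustrative only: the names `ext_gcd` and `form_from_representation` are hypothetical, the starting pair $\beta_0, \delta_0$ is found with the extended Euclidean algorithm (any other method would do), and the formulas for m and l are those of equations (5) as they are used in the paragraph above, with m normalized to $0 \le m < 2n$.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g, where g is +/- gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0


def form_from_representation(form, alpha, gamma):
    """Given a proper representation (alpha, gamma) of n > 0 by form = (a, b, c),
    return (beta, delta, m, l) with alpha*delta - beta*gamma = 1 and 0 <= m < 2n."""
    a, b, c = form
    g, x, y = ext_gcd(alpha, gamma)
    if abs(g) != 1:
        raise ValueError("alpha and gamma must be relatively prime")
    delta0, beta0 = g * x, -g * y  # alpha*delta0 - beta0*gamma = 1
    n = a * alpha * alpha + b * alpha * gamma + c * gamma * gamma
    if n <= 0:
        raise ValueError("n must be positive")
    m0 = (2 * a * alpha * beta0 + b * (alpha * delta0 + beta0 * gamma)
          + 2 * c * gamma * delta0)
    m = m0 % (2 * n)               # m is determined modulo 2n
    t = (m - m0) // (2 * n)        # the shift that fixes beta and delta
    beta, delta = beta0 + alpha * t, delta0 + gamma * t
    l = a * beta * beta + b * beta * delta + c * delta * delta
    return beta, delta, m, l


# 5 = 1^2 + 2^2 and 5 = 2^2 + 1^2, represented by [1, 0, 1] (discriminant 4).
print(form_from_representation((1, 0, 1), 1, 2))  # (0, 1, 4, 1)
print(form_from_representation((1, 0, 1), 2, 1))  # (1, 1, 6, 2)
```

In both cases $4ln - m^2 = 4$, and the two values m = 4 and m = 6 are the two solutions of $m^2 \equiv -4 \pmod{20}$ with $0 \le m < 10$.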

Theorem 1-5. Let $\alpha$, $\gamma$ be a proper representation of n > 0 by the 
integral form [a, b, c] of discriminant $\Delta$. Then there are unique 
integers $\beta$ and $\delta$ such that $\alpha\delta - \beta\gamma = 1$, and the substitution (14) 
replaces [a, b, c] by the equivalent form [n, m, l], where $0 \le m < 2n$, 
m satisfies the congruence 

$$m^2 \equiv -\Delta \pmod{4n}, \tag{15}$$

and 

$$l = \frac{m^2 + \Delta}{4n}. \tag{16}$$

Thus to each proper representation of n by [a, b, c] there corresponds 
a unique form which has first coefficient n and satisfies certain 
auxiliary conditions. The appropriate converse, which we now consider, 
gives the number of such representations, and provides an effective 
method of finding them. If m is a solution of (15) and $0 \le m < 2n$, 
then 4n - m is also a root, and $2n < 4n - m \le 4n$. We shall refer 
to m as a minimum root if $0 \le m < 2n$. 


Theorem 1-6. Let w(f) be the number of automorphs of f = [a, b, c], 
an integral positive form of discriminant $\Delta$. Let n be a positive integer. 
Corresponding to each minimum root m of the congruence (15), 
determine l by equation (16). Then the number of proper representations 
of n by f is w(f) times the number of such forms [n, m, l] which 
are equivalent to f. In particular, if there is only one class of 
discriminant $\Delta$, the number of proper representations is w(f) times the 
number of minimum roots of (15). 


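Proof: Suppose that g = [n, m, l] is a form of the type described in 
the theorem. Then if f is not equivalent to g, Theorem 1-5 shows 
that there is no representation of n by f corresponding to the minimum 
root m. If f is equivalent to g, let T be the matrix of a substitution 
which replaces f by g, and let A be the matrix of an automorph of f. 
Then (primes denoting transposes) 

$$G = T'FT \qquad \text{and} \qquad F = A'FA,$$

so 

$$(AT)'F(AT) = T'A'FAT = T'FT = G,$$

so that AT is also the matrix of a substitution which carries f into g. 
Conversely, if for any U, 

$$G = U'FU,$$

then U'FU = T'FT, and 

$$F = T'^{-1}U'FUT^{-1} = (UT^{-1})'F(UT^{-1}),$$

so that $UT^{-1}$ is the matrix A of an automorph of f, and U = AT. 
Hence there are exactly w(f) substitutions which replace f by g. If 

$$T = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

and f has only two automorphs (see Theorem 1-4), then 

$$AT = \pm\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix},$$

and $\alpha, \gamma$ and $-\alpha, -\gamma$ give two distinct proper representations, since 
$(\alpha, \gamma) = 1$ and therefore $\alpha$ and $\gamma$ are not both zero. If $f \sim a(x^2 + y^2)$, 
then 

$$AT = \pm\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \qquad \text{or} \qquad \pm\begin{pmatrix} \gamma & \delta \\ -\alpha & -\beta \end{pmatrix},$$

and the representations $\alpha, \gamma$; $-\alpha, -\gamma$; $-\gamma, \alpha$; and $\gamma, -\alpha$ are again 
distinct. If $f \sim a(x^2 + xy + y^2)$, then AT is one of the matrices 

$$\pm\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}, \qquad \pm\begin{pmatrix} \alpha + \gamma & \beta + \delta \\ -\alpha & -\beta \end{pmatrix}, \qquad \pm\begin{pmatrix} -\gamma & -\delta \\ \alpha + \gamma & \beta + \delta \end{pmatrix},$$

and these also lead to distinct representations. 

If there is only one class of discriminant $\Delta$, then f and g are 
necessarily equivalent, so that all minimum roots of (15) lead to 
representations. The proof is complete. 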


In the case of primitive forms (those having relatively prime 
coefficients), w(f) depends only on $\Delta$: w(f) = 6, 4, or 2 according 
as $\Delta$ is 3, 4, or larger than 4. If $f(x, y) = x^2 + y^2$, so that $\Delta = 4$, 
then m must be even to satisfy (15). Let $m = 2m_1$; then $m_1^2 \equiv 
-1 \pmod n$, and $0 \le m < 2n$ means $0 \le m_1 < n$, so that the 
number of proper representations of n as a sum of two squares is four 
times the number of solutions of the congruence $x^2 \equiv -1 \pmod n$. 
This result was obtained in Theorem 7-5, Volume I, by quite different 
methods. 
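
The count just stated is easy to check numerically for small n. The following sketch compares a brute-force enumeration of the proper representations $n = x^2 + y^2$ with four times the number of residues x modulo n satisfying $x^2 \equiv -1$. The function names are hypothetical, and the brute-force search is chosen only for transparency, not efficiency.

```python
from math import gcd, isqrt


def proper_two_square_reps(n):
    """All pairs (x, y) of relatively prime integers with x^2 + y^2 = n."""
    reps = []
    r = isqrt(n)
    for x in range(-r, r + 1):
        y_squared = n - x * x
        y = isqrt(y_squared)
        if y * y == y_squared:
            for signed_y in {y, -y}:       # a set, so y = 0 is counted once
                if gcd(x, signed_y) == 1:
                    reps.append((x, signed_y))
    return reps


def roots_of_minus_one(n):
    """Residues x with 0 <= x < n and x^2 = -1 (mod n)."""
    return [x for x in range(n) if (x * x + 1) % n == 0]


for n in (5, 25, 65):
    print(n, len(proper_two_square_reps(n)), 4 * len(roots_of_minus_one(n)))
# 5 8 8
# 25 8 8
# 65 16 16
```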


PROBLEMS 

1. Find $\beta$, $\delta$, m, l of Theorem 1-5 corresponding to the proper 
representation 3, 5 of 118 by [2, -5, 7]. 

2. What is the number of proper representations of 28 by [1, 1, 2]? 
Find them. 

3. Use Theorem 1-6 to discuss the proper representability of 10 by 
[2, 1, 2]. 

4. Show that every prime congruent to 1 or 3 (mod 8) has a unique 
proper representation in the form $x^2 + 2y^2$ with x > 0, y > 0. More 
generally, show that if n is the product of powers of r such primes, then n 
has $2^{r-1}$ proper representations in this form. 

1-7 Indefinite forms. The behavior of indefinite binary forms is 
remarkably different from that of forms with positive discriminant. 
For example, any integral indefinite form whose discriminant is not 
the negative of a square has infinitely many automorphs, and therefore 
represents any integer in infinitely many ways if it represents it 
at all. Moreover, there seems to be no natural way to pick out a 
unique reduced form in each equivalence class, although we shall find 
a finite set of canonical forms in the case of integral forms. 

Hereafter we restrict attention to integral forms [a, b, c], and put 

$$D = -\Delta = b^2 - 4ac > 0.$$

If D is a square, then [a, b, c] factors into two linear factors with 
integral coefficients. We dismiss this degenerate case, and hereafter 
require that D be a nonsquare integer. Finally, for the sake of 
simplicity we consider only the case that [a, b, c] is primitive. We see 
from equations (5) (proof by contradiction) that any form [A, B, C] 
equivalent to a primitive form is again primitive. 

As before, there is associated with [a, b, c] the quadratic equation 

$$az^2 + bz + c = 0,$$

which this time has two real roots, say 

$$\omega_1 = \frac{-b + \sqrt{D}}{2a}, \qquad \omega_2 = \frac{-b - \sqrt{D}}{2a}.$$

It is easily verified that a transformation of the modular group which 
sends [a, b, c] into [a', b', c'] sends $\omega_1$ into $\omega_1'$ and $\omega_2$ into $\omega_2'$, and 
never $\omega_1$ into $\omega_2'$. We call $\omega_1$ the first root, and $\omega_2$ the second root. 

As C. Hermite noticed, there is also associated with 

$$[a, b, c] = a(x - \omega_1 y)(x - \omega_2 y)$$

a family of definite forms 

$$\varphi_t(x, y) = \frac{a}{2t}(x - \omega_1 y)^2 + \frac{at}{2}(x - \omega_2 y)^2,$$

where t > 0 is a real parameter. A simple calculation shows that the 
discriminant of $\varphi_t(x, y)$ is D, for every t > 0. Reverting to the 
quotient variable z = x/y, we find the zeros of $\varphi_t(z)$ to be those 
points $z_t$ such that 

$$\frac{1}{t}(z_t - \omega_1)^2 = -t(z_t - \omega_2)^2$$

or 

$$z_t - \omega_1 = \pm it(z_t - \omega_2).$$

The transformation z' = iz rotates the plane about the origin through 
the angle $\pi/2$; it follows from the last equation that the line segment 
connecting $z_t$ with $\omega_1$ is perpendicular to the segment connecting $z_t$ 
with $\omega_2$, and hence that $z_t$ lies on the circle having as diameter the 
segment which connects $\omega_1$ and $\omega_2$. If, as usual, we take that root 
which has positive imaginary part as the representative of $\varphi_t$, then we 
have associated with [a, b, c] the semicircle $\Sigma$ in U connecting $\omega_1$ 
and $\omega_2$. As t varies from $0^+$ to $\infty$, $z_t$ describes $\Sigma$ from $\omega_1$ to $\omega_2$; we 
can think of the semicircle as oriented with this sense, inasmuch as the 
orientation is preserved under transformations of $\Gamma$. This orientation 
is necessary, since otherwise there would be no way of distinguishing 
the (usually inequivalent) forms [a, b, c] and [-a, -b, -c]. The 
form is now completely described by specifying its oriented semicircle 
$\Sigma$ and its discriminant -D. 

An indefinite form f will be called reduced if the associated 
semicircle intersects the fundamental region R considered earlier. Thus 
f is reduced if and only if the definite form $\varphi_t$ is reduced for some t. 
The fact that any indefinite form is equivalent to a reduced form is an 
immediate consequence of the fact that $\varphi_1$, for example, is equivalent 
to a reduced definite form: the transformation which carries $\varphi_1$ into 
a reduced form also carries f into a reduced form. The difficulty lies 
in showing that each indefinite integral form is equivalent to only 
finitely many reduced forms. To do this, we must first examine an 
important subgroup of $\Gamma$ which is intimately connected with f. 



1-8 The automorphs of indefinite forms. A transformation of $\Gamma$ 
which leaves [a, b, c] unchanged also leaves $\omega_1$ and $\omega_2$ fixed. The fixed 
points of the transformation 

$$z' = \frac{\alpha z + \beta}{\gamma z + \delta}$$

are those points $\omega$ such that 

$$\omega = \frac{\alpha\omega + \beta}{\gamma\omega + \delta}$$

or 

$$\gamma\omega^2 + (\delta - \alpha)\omega - \beta = 0. \tag{17}$$

Suppose that the roots of this equation are $\omega_1$ and $\omega_2$. These numbers 
are also the roots of the equation $a\omega^2 + b\omega + c = 0$; since 
(a, b, c) = 1, it follows that for some integer u, 

$$\gamma = au, \qquad \delta - \alpha = bu, \tag{18}$$

$$-\beta = cu. \tag{19}$$

Putting $\delta + \alpha = t$, we have 

$$\alpha = \frac{t - bu}{2}, \qquad \delta = \frac{t + bu}{2}, \tag{20}$$

where t and u are such that 

$$1 = \alpha\delta - \beta\gamma = \frac{t^2 - b^2u^2}{4} + acu^2 = \frac{t^2 - Du^2}{4}$$

or 

$$t^2 - Du^2 = 4. \tag{21}$$

Conversely, if t and u are solutions of (21), and $\alpha$, $\beta$, $\gamma$, and $\delta$ are 
determined by equations (18) through (20), then (17) reduces to 
$u(a\omega^2 + b\omega + c) = 0$ and $\alpha\delta - \beta\gamma = 1$. This proves 

Theorem 1-7. The set of all automorphs of the primitive indefinite 
form [a, b, c] is given by the set of all matrices 

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

with $\alpha$, $\beta$, $\gamma$, and $\delta$ determined by equations (18), (19), and (20), 
where t and u run over the integral solutions of the Pell equation (21). 
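
Theorem 1-7 can be illustrated numerically. The sketch below builds the matrix with $\alpha = (t - bu)/2$, $\beta = -cu$, $\gamma = au$, $\delta = (t + bu)/2$ from a solution of $t^2 - Du^2 = 4$ and checks, by direct substitution, that it leaves the form unchanged. The function names are hypothetical, the Pell solutions are found by a naive search over small u (an assumption made for brevity), and the example form [2, -4, -1] with D = 24 is the one treated later in Section 1-9.

```python
from math import isqrt


def transform(form, matrix):
    """Apply x = al*x' + be*y', y = ga*x' + de*y' to the form (a, b, c)."""
    a, b, c = form
    (al, be), (ga, de) = matrix
    return (a * al * al + b * al * ga + c * ga * ga,
            2 * a * al * be + b * (al * de + be * ga) + 2 * c * ga * de,
            a * be * be + b * be * de + c * de * de)


def pell_solutions(D, u_max):
    """Solutions (t, u) of t^2 - D*u^2 = 4 with t > 0 and 0 <= u <= u_max."""
    found = []
    for u in range(u_max + 1):
        s = D * u * u + 4
        t = isqrt(s)
        if t * t == s:
            found.append((t, u))
    return found


def automorph(form, t, u):
    """Matrix ((alpha, beta), (gamma, delta)) built from a Pell solution (t, u)."""
    a, b, c = form
    return (((t - b * u) // 2, -c * u), (a * u, (t + b * u) // 2))


form = (2, -4, -1)                      # D = 16 + 8 = 24
for t, u in pell_solutions(24, 20):
    m = automorph(form, t, u)
    print((t, u), m, transform(form, m) == form)
# (2, 0) ((1, 0), (0, 1)) True
# (10, 2) ((9, 2), (4, 1)) True
# (98, 20) ((89, 20), (40, 9)) True
```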



Originally, automorphs were defined as substitutions giving z in 
terms of z', while we have here used the inverse transformation giving 
z' in terms of z. But if F = A'FA, then $F = (A^{-1})'FA^{-1}$, so that the 
inverse of an automorph is also an automorph, and the set of all 
automorphs coincides with the set of all inverse automorphs. This 
fact has much greater significance than in its application above. For 
since the product of two automorphs is again an automorph, the 
automorphs of f form a subgroup of $\Gamma$, which we shall designate by 
$\Gamma_A(f)$. (The elements of $\Gamma_A(f)$ will be taken sometimes as 
transformations and sometimes as their matrices. The ambiguity resulting from 
the fact that the matrices A and -A correspond to different 
substitutions in the form but to the same fractional transformation of $\Gamma$ 
should cause no difficulty if the reader remains aware of it.) Using 
well-known properties of the solutions of Pell's equation,* $\Gamma_A(f)$ 
can be characterized as follows. 

Theorem 1-8. $\Gamma_A(f)$ is the infinite cyclic group generated by the 
matrix 

$$V = \begin{pmatrix} \tfrac12(t_0 - bu_0) & -cu_0 \\ au_0 & \tfrac12(t_0 + bu_0) \end{pmatrix};$$

if A is any automorph of f, then $A = V^n$ for some integer n, positive, 
negative or zero. Here $t_0$, $u_0$ is the minimal positive solution of 
equation (21). 

(The ambiguity mentioned above is exemplified here: every 
transformation $z' = (\alpha z + \beta)/(\gamma z + \delta)$ of $\Gamma_A(f)$ can be made to have 
matrix $V^n$, but the set of all substitutions which leave f fixed is given 
by $X = \pm V^n X'$.) 


Proof: According to Theorem 1-7, $\Gamma_A(f)$ is the group of matrices 

$$\begin{pmatrix} \tfrac12(t - bu) & -cu \\ au & \tfrac12(t + bu) \end{pmatrix}, \qquad t^2 - Du^2 = 4,$$

so it is to be shown that each of these matrices is a power of V. 
If we put 

$$\left(\tfrac12(t_0 + u_0\sqrt{D})\right)^{n+1} = \tfrac12(t_n + u_n\sqrt{D})$$

*Pell's equation is discussed in Volume I, Chapter 8. The minimal 
positive solution is described in Theorem 8-7. 

for each n, then 

$$\tfrac12(t_{n+1} + u_{n+1}\sqrt{D}) = \tfrac14(t_n + u_n\sqrt{D})(t_0 + u_0\sqrt{D})$$
$$= \tfrac14(t_0t_n + Du_0u_n) + \tfrac14(t_0u_n + t_nu_0)\sqrt{D},$$

so that 

$$t_{n+1} = \tfrac12(t_0t_n + Du_0u_n), \qquad u_{n+1} = \tfrac12(t_0u_n + t_nu_0).$$

Now suppose that 

$$V^{n+1} = \begin{pmatrix} \tfrac12(t_n - bu_n) & -cu_n \\ au_n & \tfrac12(t_n + bu_n) \end{pmatrix},$$

an assumption which is correct for n = 0. Then 

$$V^{n+2} = V\,V^{n+1} = \begin{pmatrix} \tfrac12(t_0 - bu_0) & -cu_0 \\ au_0 & \tfrac12(t_0 + bu_0) \end{pmatrix} \begin{pmatrix} \tfrac12(t_n - bu_n) & -cu_n \\ au_n & \tfrac12(t_n + bu_n) \end{pmatrix}$$

$$= \begin{pmatrix} \tfrac12(t_{n+1} - bu_{n+1}) & -\tfrac12 c(u_0t_n + u_nt_0) \\ \tfrac12 a(u_0t_n + u_nt_0) & \tfrac12(t_{n+1} + bu_{n+1}) \end{pmatrix}$$

$$= \begin{pmatrix} \tfrac12(t_{n+1} - bu_{n+1}) & -cu_{n+1} \\ au_{n+1} & \tfrac12(t_{n+1} + bu_{n+1}) \end{pmatrix},$$

and by induction, $V^n$ is of the supposed form for all n > 0. Similarly, 
it can be shown that 

$$V^n = \begin{pmatrix} \tfrac12(t_{n-1} - bu_{n-1}) & -cu_{n-1} \\ au_{n-1} & \tfrac12(t_{n-1} + bu_{n-1}) \end{pmatrix},$$

so that $V^n$ is also of the supposed form for all n < 0. Hence the 
matrix corresponding to any solution of equation (21) is a power of V, 
and the theorem is proved. 
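
The induction in this proof can be checked mechanically for a particular form. The sketch below multiplies out powers of V and compares them with the matrices obtained from the recurrence $t_{n+1} = \tfrac12(t_0t_n + Du_0u_n)$, $u_{n+1} = \tfrac12(t_0u_n + t_nu_0)$. The helper names are hypothetical, and the minimal solutions used in the example, $(t_0, u_0) = (10, 2)$ for D = 24 and $(11, 3)$ for D = 13, are assumptions supplied by hand.

```python
def mat_mul(p, q):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def automorph(form, t, u):
    """Matrix of Theorem 1-7 attached to a solution (t, u) of t^2 - D u^2 = 4."""
    a, b, c = form
    return (((t - b * u) // 2, -c * u), (a * u, (t + b * u) // 2))


def powers_match_recurrence(form, t0, u0, count):
    """Check that V, V^2, ..., V^count agree with the (t_n, u_n) recurrence."""
    a, b, c = form
    D = b * b - 4 * a * c
    V = automorph(form, t0, u0)
    power, t, u = V, t0, u0              # power = V^(n+1) pairs with (t_n, u_n)
    for _ in range(count):
        if power != automorph(form, t, u) or t * t - D * u * u != 4:
            return False
        power = mat_mul(V, power)
        t, u = (t0 * t + D * u0 * u) // 2, (t0 * u + t * u0) // 2
    return True


print(powers_match_recurrence((2, -4, -1), 10, 2, 6))  # True
print(powers_match_recurrence((1, 3, -1), 11, 3, 6))   # True
```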

As usual, it is useful to know a fundamental region of $\Gamma_A(f)$. 

Theorem 1-9. Suppose that the perpendicular bisector $C_0$ of the 
segment I joining $\omega_1$ and $\omega_2$ is mapped by V into the circle $C_1$. Then 
$C_1$ does not intersect $C_0$, and the (infinite) region between them, 
together with $C_1$, $\omega_1$, and $\omega_2$, is a fundamental region of $\Gamma_A(f)$. 

Proof: If the arbitrary transformation 

$$z' = T(z) = (\alpha z + \beta)/(\gamma z + \delta)$$

has the distinct fixed points $z_1$ and $z_2$, then by dividing $z' - z_1$ by 
$z' - z_2$ we get 

$$\frac{z' - z_1}{z' - z_2} = \frac{\alpha z + \beta - z_1(\gamma z + \delta)}{\alpha z + \beta - z_2(\gamma z + \delta)} = \frac{(\alpha - \gamma z_1)z + (\beta - \delta z_1)}{(\alpha - \gamma z_2)z + (\beta - \delta z_2)}$$

$$= \frac{\alpha - \gamma z_1}{\alpha - \gamma z_2} \cdot \frac{z + (\beta - \delta z_1)/(\alpha - \gamma z_1)}{z + (\beta - \delta z_2)/(\alpha - \gamma z_2)}$$

$$= \frac{\alpha - \gamma z_1}{\alpha - \gamma z_2} \cdot \frac{z - T^{-1}(z_1)}{z - T^{-1}(z_2)} = \frac{\alpha - \gamma z_1}{\alpha - \gamma z_2} \cdot \frac{z - z_1}{z - z_2}.$$

In the case at hand, T is the transformation 

$$V:\quad z' = \frac{\tfrac12(t_0 - bu_0)z - cu_0}{au_0 z + \tfrac12(t_0 + bu_0)}$$

with fixed points $\omega_1$ and $\omega_2$, and 

$$\frac{\alpha - \gamma z_1}{\alpha - \gamma z_2} = \frac{\tfrac12(t_0 - bu_0) - au_0(-b + \sqrt{D})/2a}{\tfrac12(t_0 - bu_0) - au_0(-b - \sqrt{D})/2a} = \frac{t_0 - \sqrt{D}\,u_0}{t_0 + \sqrt{D}\,u_0}.$$

We put 

$$K = \frac{t_0 - \sqrt{D}\,u_0}{t_0 + \sqrt{D}\,u_0}$$

and have for V the representation 

$$\frac{z' - \omega_1}{z' - \omega_2} = K\,\frac{z - \omega_1}{z - \omega_2}. \tag{22}$$

It follows that $V^n$ is the same transformation with K replaced by $K^n$; 
this could be used to give a second proof of Theorem 1-8. 

By its definition, K is a real number between 0 and 1. Since the 
perpendicular bisector $C_0$ of the segment I joining $\omega_1$ and $\omega_2$ has the 
equation $|z - \omega_1| = |z - \omega_2|$, $V^n$ transforms it into $|z' - \omega_1| = 
K^n|z' - \omega_2|$, as we see by taking absolute values in (22). If we put 
z = x + iy, the last equation becomes 

$$C_n:\quad (x - \omega_1)^2 + y^2 = K^{2n}\left((x - \omega_2)^2 + y^2\right), \qquad n \ne 0,$$

and it is a matter of simple analytic geometry to prove the following 
assertions: for positive n, $C_n$ is a circle with its center on the real 
axis, on the extension through $\omega_1$ of I; it contains $\omega_1$ in its interior; 
it lies entirely on that side of $C_0$ on which $\omega_1$ lies; and its radius 
approaches zero as n increases. For negative n, the circles $C_n$ lie on 
the other side of $C_0$, contain $\omega_2$, and close down on $\omega_2$ as |n| increases. 
Some of these circles are shown in Fig. 1-3. The lightly shaded 
region $R_A(I)$, which is the region described in Theorem 1-9, is the 
set of points z such that 

$$K \le \left|\frac{z - \omega_1}{z - \omega_2}\right| < 1,$$

and it is clearly transformed by V into the set $R_A(V)$ of points z 
such that 

$$K^2 \le \left|\frac{z - \omega_1}{z - \omega_2}\right| < K,$$

Figure 1-3 

which is the region between $C_1$ and $C_2$, including $C_2$. In general, 
$V^n$ transforms $R_A(I)$ into the region $R_A(V^n)$ between $C_n$ and $C_{n+1}$, 
including $C_{n+1}$. Since the entire plane, excluding $\omega_1$ and $\omega_2$, is covered 
in this fashion, and no point is in two such regions, any one of them, 
together with $\omega_1$ and $\omega_2$, is a fundamental region of $\Gamma_A(f)$, and the 
proof is complete. 


We are concerned here only with the upper half-plane U; relative 
to this, a fundamental region of $\Gamma_A(f)$ is that portion of any one of the 
above regions which lies in U. 

In the next section it will be convenient to have slightly more 
freedom in choosing a fundamental region of $\Gamma_A(f)$. We get this by 
noticing that, instead of beginning with the line $C_0$, we could have 
started with any member of the family of circles 

$$\left|\frac{z - \omega_1}{z - \omega_2}\right| = c. \tag{23}$$

For fixed c > 0, a fundamental region $R_A(c, I)$ would then be the 
ring between the circle (23), which we might call $C_0(c)$, and its 
transform 

$$C_1(c):\quad \left|\frac{z - \omega_1}{z - \omega_2}\right| = cK;$$

the argument given above carries through with no change except for 
the introduction of a factor c in certain equations. Such a region is 
shown heavily shaded in Fig. 1-3. 


1-9 Reduction of indefinite forms. The semicircle $\Sigma$ representing 
the form f is the upper half of the circle given parametrically by 

$$\left(\frac{z - \omega_1}{z - \omega_2}\right)^2 = -t^2, \qquad 0 < t < \infty.$$

The generating automorph V, given by equation (22), changes $\Sigma$ 
into the upper half of the circle 

$$\left(\frac{z' - \omega_1}{z' - \omega_2}\right)^2 = -(tK)^2, \qquad 0 < t < \infty,$$

which is the same circle with a different parameter. In other words, 
$\Sigma$ is transformed into itself by V, and hence by any element of $\Gamma_A(f)$, 
in the sense that each point of $\Sigma$ goes into some other point of $\Sigma$, 
although no points of $\Sigma$ remain fixed except $\omega_1$ and $\omega_2$. In fact, that 
arc of $\Sigma$ which lies in a fundamental region $R_A(c, I)$ is mapped by $V^n$ 
onto the arc of $\Sigma$ which lies in the region $R_A(c, V^n)$, so that these 
various arcs are equivalent with respect to $\Gamma_A(f)$. Hence they are 
also equivalent with respect to the larger group $\Gamma$. 

Now imagine $\Sigma$ drawn in Fig. 1-3. For suitable choice of c, the 
circle $C_0(c)$ defined in the last section intersects $\Sigma$ at a point on the 
boundary of one of the transforms of R, and this is then also true of 
the equivalent point which is the intersection of $C_1(c)$ and $\Sigma$. The 
arc between these two points is thus broken up by the boundaries of 
the double triangles in Fig. 1-2 into a finite number, say n, of smaller 
arcs. If these short arcs are transformed back into R by suitable 
operations of $\Gamma$, then every point of $\Sigma$ is equivalent to some point on 
each of these new arcs; in other words, there are precisely n elements 
of $\Gamma$ which transform $\Sigma$ into a semicircle intersecting R. Hence 
there are precisely n reduced forms equivalent to f. 

Theorem 1-10. There are only finitely many reduced forms in any 
equivalence class of integral primitive indefinite forms. 

Using the definition of reduced form, it is simple to characterize 
reduced forms in terms of their coefficients. For clearly [a, b, c] is 
reduced if and only if one or both of the points $\rho$ and $-\rho^2$ are inside 
the semicircular region bounded by $\Sigma$, or if $\rho$ is on $\Sigma$. The points 
below $\Sigma$ in U are the points z = x + iy such that 

$$a\left(a(x^2 + y^2) + bx + c\right) < 0.$$

Since $\rho$ and $-\rho^2$ have the coordinates 

$$\left(\mp\tfrac12,\ \tfrac{\sqrt3}{2}\right),$$

we have that f is reduced if and only if either 

$$a(2a \pm b + 2c) < 0 \qquad \text{or} \qquad 2a - b + 2c = 0. \tag{24}$$

To find the set of reduced forms of the class containing a given form 
[a, b, c], the procedure outlined for definite forms may first be used 
to reduce $\varphi_1(x, y) = \tfrac{a}{2}(x - \omega_1 y)^2 + \tfrac{a}{2}(x - \omega_2 y)^2 = ax^2 + bxy + 
(b^2 + D)y^2/4a$; the transformation which reduces $\varphi_1(x, y)$ also 
reduces [a, b, c], say to $[a_1, b_1, c_1]$. Thus the semicircle $\Sigma_1$ representing 
$[a_1, b_1, c_1]$ intersects the fundamental region R of $\Gamma$, either in an 
arc or in the single point $\rho$. Starting from a point on $\Sigma_1$ in R, move 
along $\Sigma_1$ in the direction in which it is oriented. At the point at 
which $\Sigma_1$ leaves R, it enters one of the regions 

$$R(S^{-1}),\ R(S^{-1}W),\ R(WSW),\ R(WS),\ R(W),\ R(WS^{-1}),\ R(SW),\ \text{or}\ R(S),$$

since these are the only regions adjacent to R (cf. Fig. 1-2). If it 
enters $R(T_1)$, then $T_1^{-1}$ sends $\Sigma_1$ into a new semicircle $\Sigma_2$ (associated 
with $[a_2, b_2, c_2]$) which has an arc in R, and this arc is the image 
under $T_1^{-1}$ of the portion of $\Sigma_1$ in $R(T_1)$. The same argument can 
now be applied to $\Sigma_2$, leading to a $\Sigma_3$ (associated with $[a_3, b_3, c_3]$) 
which has an arc in R, and this arc is the image under $T_2^{-1}T_1^{-1}$ of 
the arc of $\Sigma_1$ next encountered in moving along $\Sigma_1$ in the positive 
direction. If the process is repeated m times, $\Sigma_1$ and $[a_1, b_1, c_1]$ will 
recur. 


It is rather the exceptional case that $\Sigma$ passes through $\rho$ or $-\rho^2$. 
If it does not, the array of possible transformations listed above 
simplifies: the only T's to consider are then S, $S^{-1}$, and W. For 
example, consider the reduced form [2, -4, -1], where $\omega_1 = 
(2 + \sqrt6)/2$, $\omega_2 = (2 - \sqrt6)/2$. $\Sigma_1$ goes from R to R(W), so we 
make the inversion $T_1^{-1} = W$, or z = -1/z'. This replaces [a, b, c] 
by [c, -b, a], so here $[a_2, b_2, c_2]$ = [-1, 4, 2]. $\Sigma_2$ goes from R to 
R(S), so we make the translation $S^{-1}$, or z = z' + 1. In general 
this replaces [a, b, c] by [a, 2a + b, a + b + c], so here $[a_3, b_3, c_3]$ = 
[-1, 2, 5]. $\Sigma_3$ also goes from R to R(S), and we get $[a_4, b_4, c_4]$ = 
[-1, 0, 6]. $\Sigma_4$ also goes from R to R(S), and $[a_5, b_5, c_5]$ = [-1, -2, 5]. 
A final application of $S^{-1}$ gives $[a_6, b_6, c_6]$ = [-1, -4, 2]. Since $\Sigma_6$ 
goes from R to R(W), we invert, to get $[a_7, b_7, c_7]$ = [2, 4, -1]. $\Sigma_7$ 
goes from R to $R(S^{-1})$, so we must make the translation S: z = z' - 1. 
In general, this replaces [a, b, c] by [a, b - 2a, a - b + c], so here 
$[a_8, b_8, c_8]$ = [2, 0, -3]. A second application of S gives $[a_9, b_9, c_9]$ = 
[2, -4, -1] = $[a_1, b_1, c_1]$, and we have the complete set of reduced 
forms for this class. If the algorithm were repeated indefinitely, a 
periodic sequence of forms would arise; it is therefore meaningful to 
speak of the period of reduced forms. 
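
The arithmetic of this example can be replayed mechanically. The sketch below does not decide which region each semicircle enters (that geometric step is taken from the text); it only applies the stated coefficient rules in the stated order and checks that every form obtained has D = 24 and satisfies condition (24). All function names are hypothetical.

```python
def invert(form):
    """z = -1/z': [a, b, c] -> [c, -b, a]."""
    a, b, c = form
    return (c, -b, a)


def shift_plus(form):
    """z = z' + 1: [a, b, c] -> [a, 2a + b, a + b + c]."""
    a, b, c = form
    return (a, 2 * a + b, a + b + c)


def shift_minus(form):
    """z = z' - 1: [a, b, c] -> [a, b - 2a, a - b + c]."""
    a, b, c = form
    return (a, b - 2 * a, a - b + c)


def big_d(form):
    a, b, c = form
    return b * b - 4 * a * c


def is_reduced(form):
    """Condition (24): a(2a +/- b + 2c) < 0 for some sign, or 2a - b + 2c = 0."""
    a, b, c = form
    return (a * (2 * a + b + 2 * c) < 0 or a * (2 * a - b + 2 * c) < 0
            or 2 * a - b + 2 * c == 0)


# Order of operations as described in the text for [2, -4, -1].
STEPS = [invert, shift_plus, shift_plus, shift_plus, shift_plus,
         invert, shift_minus, shift_minus]


def run_period(start, steps=STEPS):
    forms = [start]
    for step in steps:
        forms.append(step(forms[-1]))
    return forms


period = run_period((2, -4, -1))
print(period[:-1])
print(period[-1] == period[0], all(is_reduced(f) and big_d(f) == 24 for f in period))
```

The first line printed lists the eight forms of the period in the order found in the text; the second prints `True True`, confirming that the sequence closes up and that each form is reduced with D = 24.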

The following principle is useful in these calculations: If after a 
translation the inequality (24) is correct for just one choice of sign, 
the next step is an inversion, while if it holds for both signs, the next 
step is a repetition of the translation. (S is never followed by $S^{-1}$, 
nor W by W.) The reason for this should become clear upon looking 
back at the derivation of (24). 

Theorem 1-11. There are only finitely many classes of integral 
indefinite forms of given discriminant. 

Proof: First consider the primitive forms; for them it suffices to 
show that there are only finitely many reduced forms of given 
discriminant $\Delta = -D$. From (24) we get 

$$2a^2 \pm ab \le -2ac,$$

so 

$$4a^2 \pm 2ab + b^2 \le b^2 - 4ac = D.$$

But for each choice of sign, $4a^2 \pm 2ab + b^2$ is positive definite; it 
therefore represents only positive integers unless a = b = 0, and by 
Theorem 1-6, each of the integers 1, 2, ..., D is represented in only 
finitely many ways. Hence there are only finitely many choices for 
a and b, and for each choice, c is fixed by the requirement $b^2 = D + 4ac$. 
There are therefore only finitely many reduced forms, and hence only 
finitely many periods, and so only finitely many classes. 

If a class contains an imprimitive form, say with (a, b, c) = d, then 
every form in that class also has divisor d, so that the class consists of 
the elements of a class of primitive forms with smaller D, each 
multiplied by d. There are only finitely many such classes. 
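
The proof gives explicit bounds, which makes the list of reduced primitive forms for a given D computable. In the sketch below the bounds $a^2 \le D/3$ and $b^2 \le 4D/3$ are derived from $4a^2 \pm 2ab + b^2 \le D$ by completing the square (this derivation is an added step, not taken from the text), and reducedness is tested directly by condition (24). The function names are hypothetical.

```python
from math import gcd, isqrt


def is_reduced(a, b, c):
    """Condition (24): a(2a +/- b + 2c) < 0 for some sign, or 2a - b + 2c = 0."""
    return (a * (2 * a + b + 2 * c) < 0 or a * (2 * a - b + 2 * c) < 0
            or 2 * a - b + 2 * c == 0)


def reduced_indefinite_forms(D):
    """All reduced primitive integral forms [a, b, c] with b^2 - 4ac = D.

    D is assumed to be a positive nonsquare integer.
    """
    forms = []
    a_max = isqrt(D // 3)        # 3a^2 <= 4a^2 +/- 2ab + b^2 <= D
    b_max = isqrt(4 * D // 3)    # (3/4)b^2 <= 4a^2 +/- 2ab + b^2 <= D
    for a in range(-a_max, a_max + 1):
        if a == 0:
            continue             # a = 0 would make D = b^2 a square
        for b in range(-b_max, b_max + 1):
            numerator = b * b - D            # must equal 4ac
            if numerator % (4 * a) != 0:
                continue
            c = numerator // (4 * a)
            if gcd(gcd(a, b), c) == 1 and is_reduced(a, b, c):
                forms.append((a, b, c))
    return forms


forms_24 = reduced_indefinite_forms(24)
print(len(forms_24))  # 16
```

For D = 24 the search finds 16 reduced primitive forms, among them the eight forms of the period of [2, -4, -1] computed above.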

PROBLEMS 

1. Find the period of reduced forms belonging to the class of 
$x^2 + 7xy + 7y^2$. 

2. Show that Theorem 1-7 remains correct if the word "indefinite" is 
omitted, that is, if D < 0 (cf. Theorem 1-4). 

3. Show that there is just one class of primitive forms with D = 20, and 
one class of imprimitive forms. 

1-10 Representations. The discussion occurring between Theorems 
1-4 and 1-6 made no use of the definiteness of the form; it is 
therefore equally applicable to indefinite forms. Thus Theorem 1-6 
can be recast as follows. 

Theorem 1-12. Let f = [a, b, c] be a primitive integral indefinite 
form of discriminant $\Delta$, where $D = -\Delta$ is not a square. Let n be an 
integer. Corresponding to each minimum root m of the congruence 
(15), determine l by (16). If none of the forms [n, m, l] is equivalent 
to f, there are no proper representations of n by f. If at least one of the 
new forms is equivalent to f, there are infinitely many proper 
representations of n by f; they are given by the first columns of all the 
matrices AT, where A can be any automorph $\pm V^n$ of f, and T is any of a 
set of matrices which replace f by the various equivalent forms [n, m, l], 
each form being obtained from just one T. 


PROBLEMS 

1. Discuss the proper representation of 13 by [1, 3, -1]. 

2. Show that the odd numbers properly represented by $x^2 + 4xy - y^2$ 
are those of the form 

$$5^{\epsilon} \prod_{i=1}^{r} p_i^{\alpha_i},$$

where $\epsilon = 0$ or 1, $r \ge 0$, and $p_i \equiv \pm 1 \pmod{10}$ for $1 \le i \le r$ (cf. 
Problem 3, Section 1-10). 



CHAPTER 2 


ALGEBRAIC NUMBERS 

2-1 Introduction. With a few exceptions, the theory developed 
up to this point, both in this volume and in the preceding intro¬ 
ductory volume, has been self-contained, in the sense that the prob¬ 
lems, which had to do with the ordinary integers, were solved 
without going outside this system. When considering the distri¬ 
bution of primes and the theory of quadratic forms, we made use 
of the real and complex numbers, but not in an intrinsically arith¬ 
metic fashion. In the investigation of the representability of an 
integer as a sum of squares,* however, we had occasion to consider 
the arithmetic structure of the set of Gaussian integers, and to apply 
this to a problem involving ordinary integers. During the last 
century, it has been found that many problems in rational arithmetic 
are treated most naturally by introducing larger sets of “integers” 
and deducing, from the structure of the extended system, information 
about the ordinary integers. Of course, as soon as a mathematician 
begins to work in a new medium, to use a metaphor from art, he 
finds interesting questions which have little or nothing to do with the 
original problem. In the present case, this tendency was instrumental 
in the development of modern abstract algebra, a large portion of 
which has only a tenuous connection with number theory. 

From the point of view of this text, general algebraic theory must 
take second place, the primary object being to give the reader an 
appreciation of the power afforded by the method, as well as a knowl¬ 
edge of some of the basic results in the subject. For this reason, the 
formulation will be kept as concrete as possible; there will be no 
striving for generality or abstractness for their own sakes. The 
treatment is self-contained, except for the following two theorems, 
whose proofs can be found, for example, in L. E. Dickson, First 
Course in the Theory of Equations (New York: John Wiley & Sons, 
Inc., 1921), pp. 130-131 and 124-125, respectively. 

*See, for example, Volume I, Chapter 7. 



The product $D_1D_2$ of two determinants of the same order is another 
determinant of that order, whose element in row i and column j is the 
sum of the products of the elements of the ith row of $D_1$ and the 
corresponding elements of the jth row of $D_2$. 

Symmetric Function Theorem. Any polynomial $P(x_1, \ldots, x_n)$ 
symmetric in $x_1, \ldots, x_n$ and of degree g in each, is equal to a 
polynomial of total degree g, with integral coefficients, in the elementary 
symmetric functions 

$$\sum x_1,\ \ldots,\ x_1x_2 \cdots x_n$$

and the coefficients of $P(x_1, \ldots, x_n)$. In particular, any symmetric 
polynomial with integral coefficients is equal to a polynomial in the 
elementary symmetric functions with integral coefficients. 

If P is a polynomial in the roots of an equation f(x) = 0 of degree n 
and leading coefficient 1, and if P is symmetric in n - 1 of the roots, 
then P is equal to a polynomial, with integral coefficients, in the 
remaining root and the coefficients of f(x) and P. 

We shall also have occasion to use the so-called Fundamental 
Theorem of algebra; this basic assertion is proved in the remainder 
of the section. 

Fundamental Theorem of Algebra. A polynomial $f(z) = a_0z^n + 
a_1z^{n-1} + \cdots + a_n$ having complex coefficients and positive degree, has a 
complex zero. (It follows immediately that it has exactly n complex 
zeros, in the sense that there are complex numbers $\xi_1, \ldots, \xi_n$ such that 

$$f(z) = a_0(z - \xi_1) \cdots (z - \xi_n).)$$

Proof: Since the truth of the theorem depends on the structure of 
the complex numbers, it is necessary to use some properties of these 
numbers. If the entire theory of functions of a complex variable is 
assumed, the proof is very easy indeed: an analytic function has as 
many zeros as poles, and a polynomial has a pole at infinity, so it 
must have at least one zero. If less than this is assumed, it is 
reasonable to ask that as little be assumed as possible. The proof to be 
given uses the fact that a real-valued continuous function of two real 
variables has a minimum value in any closed domain, and it assumes 
familiarity with the symbol $\sqrt{a}$, where a is real. (If DeMoivre's 
theorem were used, to give meaning to $\sqrt[n]{a}$ for complex a, the proof 
would be slightly simpler.) 





With the second assumption, the quadratic formula provides a 
proof when n = 2 and the coefficients are real. To solve a quadratic 
equation with nonreal coefficients, it may be necessary to extract the 
square root of a nonreal number. Let the number be $a + bi$. Then
the equation $a + bi = (x + iy)^2$ gives

$$a = x^2 - y^2 \quad\text{and}\quad b = 2xy,$$

or

$$4x^4 - 4ax^2 - b^2 = 0,$$

and we can take

$$x = \sqrt{\frac{a + \sqrt{a^2 + b^2}}{2}}.$$
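The extraction of a complex square root by means of real square roots alone can be sketched as follows. This is only an illustration of the formula for $x$ together with $b = 2xy$; the function name `complex_sqrt` is an assumption made here, not notation from the text.

```python
import math

def complex_sqrt(a, b):
    """Return (x, y) with (x + iy)^2 = a + bi, using only real square roots."""
    r = math.hypot(a, b)            # sqrt(a^2 + b^2), a real square root
    x = math.sqrt((a + r) / 2)      # x = sqrt((a + sqrt(a^2 + b^2)) / 2)
    if x == 0.0:
        # Here b = 0 and a <= 0, so the root is purely imaginary.
        return 0.0, math.sqrt(-a)
    return x, b / (2 * x)           # y is recovered from b = 2xy

x, y = complex_sqrt(3.0, 4.0)       # (2 + i)^2 = 3 + 4i
```

The other square root is $-(x + iy)$; either one serves in the quadratic formula.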



Before treating the general case, note first that we can write
$f(x + iy) = G(x, y) + iH(x, y)$, where $G$ and $H$ are polynomials in
the real variables $x$ and $y$, with real coefficients. It follows from the
continuity of $G$ and $H$ throughout the $xy$-plane that $|f(z)|$ is
continuous throughout the complex $z$-plane, where $z = x + iy$. Moreover,
for $n > 0$ and $a_0 \neq 0$ (which we henceforth assume), we have

$$\lim_{|z| \to \infty} |f(z)| = \infty.$$

For if $\max(|a_0|, \ldots, |a_n|) = A$, then

$$|f(z)| \ge |a_0 z^n| - \left(|a_n| + |a_{n-1} z| + \cdots + |a_1 z^{n-1}|\right) \ge |z|^{n-1}\left(|a_0|\,|z| - nA\right)$$

for $|z| > 1$, and the right side becomes infinite with $|z|$.

Since $|f(z)|$ is continuous, it assumes a minimum value at some point
in any closed circular disk with center at 0, and since $|f(z)|$ becomes
infinite with $|z|$, the disk can be chosen so large that this minimum
occurs at an interior point $\xi$. We must show that $f(\xi) = 0$.

We now proceed by induction: suppose that every polynomial of
degree less than $n$, with complex coefficients, has a complex zero, and
that $f$ is of degree $n$ and $|f(z)|$ assumes its nonzero minimum at $\xi$.
Suppose that $f(\xi) = M$, and put

$$g(z) = \frac{f(z + \xi)}{M} = 1 + b_1 z + \cdots + b_n z^n;$$

then $|g(z)| \ge 1$ for all $z$. Define $k$ as the smallest index such that
$b_k \neq 0$, so that

$$g(z) = 1 + b_k z^k + \cdots + b_n z^n, \qquad k \le n.$$

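First consider the case that $k < n$. By the induction hypothesis,
the equation

$$1 + b_k z^k = 0$$

has a root. Let $\eta$ be this root, and put $z = \delta\eta$, where $0 < \delta < 1$.
Then

$$g(\delta\eta) = 1 + b_k \delta^k \eta^k + b_{k+1}\delta^{k+1}\eta^{k+1} + \cdots + b_n \delta^n \eta^n$$

$$= 1 - \delta^k + \left(b_{k+1}\eta^{k+1} + \cdots + b_n \eta^n \delta^{n-k-1}\right)\delta^{k+1}.$$

Now if $|b_j| < B$ for $k < j \le n$, then

$$\left|\left(b_{k+1}\eta^{k+1} + \cdots + b_n \eta^n \delta^{n-k-1}\right)\delta^{k+1}\right|$$

$$< B(1 + |\eta|)^n \delta^{k+1}\left(1 + \delta + \cdots + \delta^{n-k-1}\right)$$

$$< Bn(1 + |\eta|)^n \delta^{k+1} = C\delta^{k+1}.$$

Thus

$$|g(\delta\eta)| < 1 - \delta^k + C\delta^{k+1} = 1 - \delta^k(1 - C\delta),$$

and for $0 < \delta < 1/C$, $|g(\delta\eta)| < 1$. This contradicts the assumption
that 1 is the minimum of $|g(z)|$; hence $M = 0$.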

If $k = n$, then $g(z) = 1 + b_n z^n$. If $n$ is even, then the equation

$$z^{n/2} - \sqrt{-1/b_n} = 0$$

is solvable, by the induction hypothesis, and any root of it is also a
root of $g(z) = 0$. Hence we can suppose that $n$ is odd. Put $b_n = c + di$.

If $c \neq 0$, we put $z = -\delta \operatorname{sgn} c$ (that is, $z = \delta$ or $-\delta$, according as
$c < 0$ or $c > 0$), and obtain

$$|1 + (c + di)z^n|^2 = \left|1 - |c|\delta^n - \delta^n d\,i \operatorname{sgn} c\right|^2$$

$$= 1 - 2|c|\delta^n + (c^2 + d^2)\delta^{2n};$$

this last expression is again smaller than 1 for $\delta$ sufficiently small, and
we have the same contradiction as before.



If $c = 0$, then $d \neq 0$; moreover, a sign can be chosen so that
$(\pm i)^n = i$. Then if $z = \pm i\delta \operatorname{sgn} d$, we have

$$\left|1 + id(\pm i\delta \operatorname{sgn} d)^n\right| = \left|1 - |d|\delta^n\right|,$$

and this is smaller than 1 for $\delta$ sufficiently small. The proof is
complete.

2-2 Polynomials and algebraic numbers. We begin by making
the following definitions.

(a) R is the set of all rational numbers. 

(b) $R[x]$ consists of $R$ together with all polynomials in $x$ with
rational coefficients, the coefficient of the highest power of $x$ being
different from zero.

(c) If a polynomial $p(x)$ is in $R[x]$, $\deg p$ means the exponent
of the highest power of $x$ occurring in $p(x)$, if this is positive; if
$a \neq 0$ is in $R$, $\deg a = 0$, while if $a = 0$, $\deg a$ is not defined.

(d) A polynomial $p(x)$ in $R[x]$ is said to be monic if the leading
coefficient is 1.

(e) If $p_1(x)$ and $p_2(x)$ are in $R[x]$, we say that $p_2(x)$ divides $p_1(x)$
(in symbols, $p_2(x) \mid p_1(x)$; the phrase does not divide is indicated by
the symbol "$\nmid$") if there is a $q(x)$ in $R[x]$ such that $p_1(x) = p_2(x)q(x)$.
Under this definition, an element of $R$ different from zero divides
every element of $R[x]$. The nonzero elements of $R$ are therefore called
units of $R[x]$.

(f) An element $p(x)$ is said to be irreducible in $R[x]$ if it cannot be
written as the product of two nonunit elements of $R[x]$.

By formalizing the ordinary process of dividing one polynomial
by another, it is not hard to show that if $p_1(x)$ and $p_2(x)$ are in $R[x]$,
and $p_2(x)$ is not zero, then there exists a unique pair of elements $q(x)$
and $r(x)$ of $R[x]$ such that

$$p_1(x) = p_2(x)q(x) + r(x), \qquad \deg r < \deg p_2 \ \text{ or } \ r(x) = 0.$$

This analog of the division theorem for integers* forms the basis for a
Euclidean algorithm, by means of which a greatest common divisor
$(p_1(x), p_2(x))$ can be determined; the development is entirely
parallel to that for the integers, and leads to the following theorems.

*See, for example, Volume I, Theorem 1-1. 



Theorem 2-1. Given two elements $p_1(x)$, $p_2(x)$ of $R[x]$, not both
zero, there is another element $d(x)$ which is unique to within a unit
factor and which has the following properties:

(a) $d(x) \mid p_1(x)$ and $d(x) \mid p_2(x)$.

(b) If $d_1(x)$ is in $R[x]$, and divides both $p_1(x)$ and $p_2(x)$, then
$d_1(x) \mid d(x)$.

If $(p_1(x), p_2(x)) = d(x)$, there are elements $q_1(x)$ and $q_2(x)$ of
$R[x]$ such that

$$p_1(x)q_1(x) + p_2(x)q_2(x) = d(x).$$
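The division theorem and the Euclidean algorithm for $R[x]$ can be tried out directly. The following is a minimal sketch, assuming polynomials are stored as lists of `Fraction` coefficients with the lowest power first; the helper names (`trim`, `add`, `mul`, `divmod_poly`, `ext_gcd`) are chosen here for illustration only.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p1, p2):
    """Return (q, r) with p1 = p2*q + r and r = 0 or deg r < deg p2."""
    q, r = [], trim(p1)
    while len(r) >= len(p2):
        term = [Fraction(0)] * (len(r) - len(p2)) + [r[-1] / p2[-1]]
        q = add(q, term)
        r = add(r, mul(term, [-c for c in p2]))   # cancels the leading term
    return q, r

def ext_gcd(p1, p2):
    """Return (d, q1, q2) with p1*q1 + p2*q2 = d, d the monic gcd."""
    r0, r1 = trim(p1), trim(p2)
    s0, s1, t0, t1 = [Fraction(1)], [], [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        neg_q = [-c for c in q]
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, mul(neg_q, s1))
        t0, t1 = t1, add(t0, mul(neg_q, t1))
    lead = r0[-1]                                 # normalize d to be monic
    return [c / lead for c in r0], [c / lead for c in s0], [c / lead for c in t0]

F = Fraction
p1 = [F(-1), F(0), F(1)]        # x^2 - 1
p2 = [F(2), F(-3), F(1)]        # x^2 - 3x + 2
d, q1, q2 = ext_gcd(p1, p2)     # d = x - 1
```

Dividing by the leading coefficient at the end uses the freedom of a unit factor in Theorem 2-1 to return the monic gcd.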

Theorem 2-2. Any nonzero element of $R[x]$ can be factored into a
product of irreducible elements of $R[x]$, and this factorization is unique
except for the order of factors and the presence of units.

There is no loss in generality, and some gain in simplicity, in
supposing that the various polynomials with which we deal are monic,
since any polynomial can be made monic by multiplication by a unit.
In this case the second part of Theorem 2-2 could be restated to read:
The factorization of a monic polynomial into irreducible monic elements
is unique except for the order of factors.

We now consider the zeros of the polynomials of $R[x]$, or, what is
the same thing, the roots of equations $p(x) = 0$. If $\alpha$ is a root of the
equation

$$p(x) = x^n + r_1 x^{n-1} + r_2 x^{n-2} + \cdots + r_n = 0, \qquad (1)$$

where $p(x)$ is in $R[x]$ and $n > 0$, then $\alpha$ is called an algebraic number;
if $p(x)$ is irreducible in $R[x]$, $\alpha$ is said to be of degree $n$. (The rational
numbers are algebraic numbers, since if $r$ is in $R$, $x - r = 0$ has the
root $x = r$. As algebraic numbers they are of degree 1, although when
considered as elements of $R[x]$ the nonzero rational numbers were
given degree 0.) An algebraic number $\alpha$ is a zero of a unique monic
irreducible polynomial in $R[x]$, called the defining polynomial of $\alpha$.
For if $p(x)$ is not irreducible, it can be factored uniquely into
irreducible monic factors, and $\alpha$ must be a zero of one of the factors.
Hence $\alpha$ satisfies some irreducible equation, i.e., an equation in which
the left side is irreducible in $R[x]$. If $\alpha$ satisfies two such equations,
say $p(x) = 0$ and $q(x) = 0$, then it also satisfies the equation $d(x) = 0$,
where $d(x) = (p(x), q(x))$. For if

$$p(x)s_1(x) + q(x)s_2(x) = d(x),$$

then

$$d(\alpha) = s_1(\alpha) \cdot 0 + s_2(\alpha) \cdot 0 = 0.$$

But since $p(x)$ and $q(x)$ are irreducible, their monic gcd is either 1 or
$p(x)$. Since $1 \neq 0$, $(p(x), q(x)) = p(x)$, and $p(x) = q(x)$.

If $p(x)$ in equation (1) is the defining polynomial of $\alpha$, its $n$ zeros
$\alpha_1 = \alpha, \alpha_2, \ldots, \alpha_n$ are called the conjugates of $\alpha$. Except for an
alternation in signs, the numbers $r_1, r_2, \ldots, r_n$ are simply the
elementary symmetric functions of $\alpha_1 = \alpha, \alpha_2, \ldots, \alpha_n$:

$$r_1 = -\sum \alpha_i = -(\alpha + \alpha_2 + \cdots + \alpha_n),$$

$$r_2 = \sum \alpha_i \alpha_j = \alpha\alpha_2 + \cdots + \alpha_{n-1}\alpha_n,$$

$$\cdots$$

$$r_n = (-1)^n \alpha\alpha_2 \cdots \alpha_n.$$

As is the case here, we shall frequently use a Greek letter, both with
and without the subscript 1, to denote a single algebraic number.

Theorem 2-3. The sum, difference, and product of two algebraic 
numbers are algebraic numbers. The quotient of two algebraic numbers 
is an algebraic number if the denominator is not zero. 

Proof: Suppose that $\alpha = \alpha_1$ and $\beta = \beta_1$ have defining polynomials

$$p(x) = x^n + r_1 x^{n-1} + \cdots + r_n = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n),$$

$$q(x) = x^m + s_1 x^{m-1} + \cdots + s_m = (x - \beta_1)(x - \beta_2) \cdots (x - \beta_m),$$

respectively. Let $\gamma_1, \gamma_2, \ldots, \gamma_{nm}$ be the numbers obtained by
adding an $\alpha_i$ and a $\beta_j$, in all possible ways. Then the polynomial
$g(x) = (x - \gamma_1)(x - \gamma_2) \cdots (x - \gamma_{nm})$ has, as coefficients,
symmetric polynomials in the $\alpha_i$ and $\beta_j$ with integral coefficients. Let
one such coefficient be $t(\alpha_1, \ldots, \alpha_n; \beta_1, \ldots, \beta_m)$; as a symmetric
polynomial in the $\alpha_i$ it is equal to a polynomial in $r_1, \ldots, r_n$, whose
coefficients are themselves polynomials in $\beta_1, \ldots, \beta_m$, with integral
coefficients. These last polynomials are symmetric in $\beta_1, \ldots, \beta_m$;
they are therefore integral combinations of $s_1, \ldots, s_m$, and
consequently are rational numbers. Thus the coefficients of $g(x)$ are
rational numbers, and $\alpha + \beta$ is an algebraic number. The same proof
applies for $\alpha \cdot \beta$ and $\alpha - \beta$, with obvious changes in the definition of
$\gamma_1, \ldots, \gamma_{nm}$.



If $\alpha$ is algebraic and different from zero, so is $1/\alpha$, for the zeros of
the polynomial

$$r_n x^n + r_{n-1} x^{n-1} + \cdots + r_1 x + 1$$

are the reciprocals of those of

$$x^n + r_1 x^{n-1} + \cdots + r_n,$$

and $r_n \neq 0$. Thus the assertion that $\alpha/\beta$ is algebraic is a consequence
of the fact that $\alpha \cdot (1/\beta)$ is algebraic.
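A polynomial with rational coefficients whose zeros are all the sums $\alpha_i + \beta_j$ can also be computed exactly by machine. The sketch below does not follow the symmetric-function argument of the proof; it swaps in a different, standard technique: the zeros of a monic polynomial are the eigenvalues of its companion matrix, the eigenvalues of the Kronecker sum of two matrices are the pairwise sums of their eigenvalues, and the Faddeev-LeVerrier recurrence yields the characteristic polynomial. All function names are illustrative assumptions.

```python
from fractions import Fraction

def companion(c):
    """Companion matrix of the monic polynomial x^n + c[0]x^(n-1) + ... + c[n-1]."""
    n = len(c)
    M = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n):
        M[i][i - 1] = Fraction(1)               # ones on the subdiagonal
    for i in range(n):
        M[i][n - 1] = -Fraction(c[n - 1 - i])   # last column: negated coefficients
    return M

def kronecker_sum(A, B):
    """A (x) I + I (x) B; its eigenvalues are all sums alpha_i + beta_j."""
    n, m = len(A), len(B)
    K = [[Fraction(0)] * (n * m) for _ in range(n * m)]
    for i in range(n):
        for j in range(n):
            for k in range(m):
                K[i * m + k][j * m + k] += A[i][j]
    for i in range(n):
        for k in range(m):
            for l in range(m):
                K[i * m + k][i * m + l] += B[k][l]
    return K

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def charpoly(M):
    """Coefficients of det(xI - M), highest power first (Faddeev-LeVerrier)."""
    n = len(M)
    coeffs = [Fraction(1)]
    Mk = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        Mk = matmul(M, Mk)
        for i in range(n):
            Mk[i][i] += coeffs[-1]              # M_k = M * M_(k-1) + c * I
        MMk = matmul(M, Mk)
        coeffs.append(-sum(MMk[i][i] for i in range(n)) / k)
    return coeffs

# alpha = sqrt(2) (x^2 - 2) and beta = sqrt(3) (x^2 - 3)
g = charpoly(kronecker_sum(companion([0, -2]), companion([0, -3])))
```

Here `g` lists the coefficients of $x^4 - 10x^2 + 1$, whose zeros are $\pm\sqrt{2} \pm \sqrt{3}$.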


The properties of the set of all algebraic numbers mentioned in
Theorem 2-3 are shared by many sets of importance in mathematics;
so many in fact that the name field has been reserved to describe such
sets. Technically, a field $F$ is a set of two or more elements $a, b, \ldots$,
together with an equivalence relation (which we designate by an
equals sign) and two operations (which we designate by the symbols
"$+$" and "$\cdot$") such that the following relations hold:

(a) For any $a$ and $b$ in $F$, either $a = b$ or $a \neq b$. If $a = b$, then
$a + c = b + c$ and $a \cdot c = b \cdot c$, for every $c$ in $F$.

(b) The elements form a commutative group with respect to the
operation "$+$", the identity element being designated by "0". In
other words, if $a$, $b$, and $c$ are in $F$, then $a + b$ is in $F$, $a + b = b + a$,
$a + (b + c) = (a + b) + c$, there is an element $-a$ in $F$ such that
$a + (-a) = (-a) + a = 0$, and $a + 0 = 0 + a = a$.

(c) The elements with 0 omitted (which we might call $F^*$) form a
commutative group with respect to the operation "$\cdot$", the identity
element being designated by "1".

(d) Multiplication is distributive with respect to addition; that is,
$a \cdot (b + c) = a \cdot b + a \cdot c$ for every $a$, $b$, and $c$ in $F$.

As long as one is working with a set of real or complex numbers, and
ordinary multiplication, addition, and equality, one can show that
the set forms a field just by showing that if $a$ and $b$ are in the set, so
are $a \pm b$, $ab$, and $a/b$ if $b \neq 0$; the other requirements are
automatically fulfilled. Thus Theorem 2-3 is just the assertion that the
set of all algebraic numbers is a field. Other familiar examples of
fields are the set of all rational numbers, the set of all real numbers,
and the set of all complex numbers. (The integers, on the other hand,
do not form a field, since only the elements $\pm 1$ have inverses, under
multiplication, in the system.) In fact, every field composed of
complex numbers together with the ordinary operations of addition
and multiplication, contains the field $R$ of rational numbers as a
subfield. There are, however, fields with only finitely many elements.
An example of such a field is the set of numbers $0, 1, \ldots, p - 1$ with
the operations of addition and multiplication modulo $p$; in this case,
$a + b$ is that element $c$ such that $a + b \equiv c \pmod{p}$; $a \cdot b$ is that
element $d$ such that $a \cdot b \equiv d \pmod{p}$; $-a$ is 0 or $p - a$, according
as $a$ is 0 or not 0; if $a \neq 0$, $a^{-1}$ is that element $f$ such that
$a \cdot f \equiv 1 \pmod{p}$.
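The operations of the finite field with $p$ elements are easy to write down explicitly. The sketch below is only an illustration; the function names are hypothetical, and the inverse is computed by Fermat's theorem ($a^{p-2} \equiv a^{-1} \pmod{p}$ for prime $p$), which is one convenient choice among several.

```python
def add_mod(a, b, p):
    return (a + b) % p          # the element c with a + b = c (mod p)

def mul_mod(a, b, p):
    return (a * b) % p          # the element d with a * b = d (mod p)

def neg_mod(a, p):
    return (p - a) % p          # 0 if a is 0, otherwise p - a

def inv_mod(a, p):
    """The element f with a * f = 1 (mod p); p is assumed to be prime."""
    if a % p == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return pow(a, p - 2, p)     # Fermat: a^(p-1) = 1 (mod p)

p = 7
inverses = {a: inv_mod(a, p) for a in range(1, p)}
```

For $p = 7$ the table `inverses` pairs 2 with 4 and 3 with 5, while 1 and 6 are their own inverses.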

The field of all algebraic numbers will play no role in the present 
discussion. We consider instead certain subfields of it, called algebraic 
number fields, described in the next theorem. 

Let $\vartheta$ be an algebraic number, of degree $n > 1$, whose defining
polynomial is $p(x)$ as given in equation (1), and whose conjugates are
$\vartheta, \vartheta_2, \ldots, \vartheta_n$.

Theorem 2-4. The set of all numbers of the form

$$\frac{q_1(\vartheta)}{q_2(\vartheta)},$$

where $q_1(x)$ and $q_2(x)$ are in $R[x]$ and $q_2(\vartheta) \neq 0$, is a field, which
will be denoted by $R(\vartheta)$. Every element of $R(\vartheta)$ can be expressed
uniquely in the form

$$\alpha = a_0 + a_1 \vartheta + \cdots + a_{n-1}\vartheta^{n-1},$$

where $a_0, a_1, \ldots, a_{n-1}$ are in $R$.

Proof: The first part is clear, since the sum, difference, product and 
quotient of rational functions are again rational functions. 

Since $q_2(\vartheta) \neq 0$ and $p(x)$ is irreducible, $q_2(x)$ and $p(x)$ are
relatively prime, and for some $t(x)$ and $s(x)$ in $R[x]$,

$$t(x)p(x) + s(x)q_2(x) = 1.$$

This gives $s(\vartheta)q_2(\vartheta) = 1$, so that

$$\alpha = \frac{q_1(\vartheta)}{q_2(\vartheta)} = s(\vartheta)q_1(\vartheta),$$

a polynomial in $\vartheta$. Since $p(\vartheta) = 0$,

$$\vartheta^n = -r_1 \vartheta^{n-1} - \cdots - r_n.$$

It follows that every positive power of $\vartheta$ can be written as a
polynomial in $\vartheta$ of degree $n - 1$ or less. The same is therefore true of
every element $\alpha$. If there were two different representations of $\alpha$ as
polynomials in $\vartheta$ of degree $n - 1$ or less and with rational coefficients,
their difference would be a polynomial of degree $n - 1$ or less which
vanishes for $x = \vartheta$, which is impossible.
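The reduction of high powers of the generator by means of its defining equation is what makes arithmetic in such a field computable. As a minimal sketch, assume an element is stored as its coefficient list $[a_0, \ldots, a_{n-1}]$ and the defining polynomial as $[r_1, \ldots, r_n]$; the function name `mul_in_field` is hypothetical.

```python
from fractions import Fraction

def mul_in_field(u, v, r):
    """Multiply two elements given by coefficient lists a_0..a_(n-1) (lowest
    power first), reducing with theta^n = -(r[0]theta^(n-1) + ... + r[n-1])."""
    n = len(r)
    prod = [Fraction(0)] * (2 * n - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += Fraction(a) * Fraction(b)
    # Eliminate theta^k for k = 2n-2, ..., n, highest power first.
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        prod[k] = Fraction(0)
        for j in range(1, n + 1):
            prod[k - j] -= c * Fraction(r[j - 1])   # theta^k = -sum r_j theta^(k-j)
    return prod[:n]

# theta = cube root of 2, defining polynomial x^3 - 2, so r = [0, 0, -2].
r = [0, 0, -2]
product = mul_in_field([1, 1, 0], [1, -1, 1], r)    # (1 + theta)(1 - theta + theta^2)
```

Since $(1 + \vartheta)(1 - \vartheta + \vartheta^2) = 1 + \vartheta^3 = 3$ here, the quotient $1/(1 + \vartheta)$ is the polynomial $(1 - \vartheta + \vartheta^2)/3$, as the theorem predicts.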

If $\alpha$ is an element of the field described in Theorem 2-4, and

$$\alpha = \varphi(\vartheta),$$

where $\varphi(x)$ is the polynomial of degree $n - 1$ or less given by that
theorem, then the numbers

$$\alpha' = \alpha, \quad \alpha'' = \varphi(\vartheta_2), \quad \ldots, \quad \alpha^{(n)} = \varphi(\vartheta_n)$$

are called the field conjugates of $\alpha$. (They may not lie in the field
described in Theorem 2-4.) Every field conjugate of $\alpha$ is also a
conjugate of $\alpha$ in the earlier sense, for if $\alpha$ has the defining equation
$g(x) = 0$, then $g(\varphi(x))$ vanishes for $x = \vartheta$, so that $p(x) \mid g(\varphi(x))$ and
$g(\varphi(\vartheta_k)) = g(\alpha^{(k)}) = 0$. The converse is also true, as the following
theorem shows.

Theorem 2-5. The set of field conjugates of an element $\alpha$ of $R(\vartheta)$ is
either identical with the set of conjugates of $\alpha$, or consists of several
copies of the set of conjugates of $\alpha$. (Hence $\deg \alpha \mid \deg \vartheta$.) The
polynomial whose zeros are the field conjugates of $\alpha$ is a power of the defining
polynomial of $\alpha$; if it is equal to the defining polynomial, then

$$R(\alpha) = R(\vartheta).$$

Proof: Form the field polynomial for $\alpha$:

$$f(x) = (x - \alpha')(x - \alpha'') \cdots (x - \alpha^{(n)}).$$

Its coefficients are symmetric polynomials in the $\alpha^{(k)}$'s, and are
therefore symmetric polynomials in $\vartheta_1, \ldots, \vartheta_n$, and so are rational
numbers. Factor $f(x)$ into its monic irreducible factors in $R[x]$, say

$$f(x) = f_1(x) \cdot f_2(x) \cdots,$$

and let $f_1(x)$ be a factor which vanishes for $x = \alpha$. Then $f_1(\varphi(\vartheta)) = 0$,
so $p(x) \mid f_1(\varphi(x))$, and $f_1(x)$ vanishes at $\alpha, \alpha'', \ldots, \alpha^{(n)}$. If these
are distinct, $f_1(x)$ is of degree $n$, and $f(x)$ is irreducible. If they are
not, let $\alpha, \alpha'', \ldots, \alpha^{(l)}$ be a maximal distinct set of $\alpha$'s. Then $f_2(x)$
vanishes for some $\alpha^{(k)}$, so $f_1(x) \mid f_2(x)$; since $f_2(x)$ is irreducible,
$f_2(x) = c f_1(x)$, and $c = 1$ since $f_1(x)$ and $f_2(x)$ are monic. If there
are other factors of $f(x)$, the argument can be repeated. Eventually,
we find that

$$f(x) = (f_1(x))^{n/l}.$$

Since the zeros of $f_1(x)$, which is the defining polynomial of $\alpha$, are the
conjugates of $\alpha$, those of $f(x)$ (that is, the field conjugates) consist of
$n/l$ copies of the set of conjugates of $\alpha$.

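Now suppose that $f_1(x) = f(x)$. Define

$$\varphi(x) = f(x)\left[\frac{\vartheta}{x - \alpha} + \frac{\vartheta_2}{x - \alpha''} + \cdots + \frac{\vartheta_n}{x - \alpha^{(n)}}\right],$$

so that $\varphi(x)$ is a polynomial of degree $n - 1$ with rational coefficients.
Since

$$\varphi(\alpha) = \vartheta(\alpha - \alpha'') \cdots (\alpha - \alpha^{(n)}) = \vartheta f'(\alpha),$$

we have that the number

$$\vartheta = \frac{\varphi(\alpha)}{f'(\alpha)}$$

is in $R(\alpha)$, so that $R(\vartheta)$ is a subfield of $R(\alpha)$, and $R(\alpha) = R(\vartheta)$.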

The last assertion of the theorem shows that if one field $R(\alpha)$ is
a proper subfield of a second field $R(\vartheta)$, then $\deg \alpha < \deg \vartheta$. For if
$\deg \alpha = \deg \vartheta$, then the field polynomial of $\alpha$ with respect to $R(\vartheta)$ is
irreducible, so that $f_1(x) = f(x)$, and $R(\alpha) = R(\vartheta)$.
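Theorem 2-5 can be illustrated numerically. The sketch below takes, as an assumed example not drawn from the text, the generator $2^{1/4}$ (defining polynomial $x^4 - 2$) and the element $\alpha = \sqrt{2}$, which is the square of the generator. The field conjugates of $\alpha$ are formed in floating point and the field polynomial is expanded and rounded; the helper name `poly_from_roots` is hypothetical.

```python
def poly_from_roots(roots):
    """Coefficients (highest power first) of the product of (x - r)."""
    coeffs = [1 + 0j]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

theta = 2 ** 0.25
theta_conjugates = [theta * u for u in (1, 1j, -1, -1j)]   # zeros of x^4 - 2
alpha_conjugates = [t * t for t in theta_conjugates]       # alpha = theta^2
field_poly = [round(c.real) for c in poly_from_roots(alpha_conjugates)]
```

The rounded coefficients are those of $x^4 - 4x^2 + 4 = (x^2 - 2)^2$: the field conjugates consist of $n/l = 4/2 = 2$ copies of the conjugates $\pm\sqrt{2}$ of $\alpha$, and $\alpha$ generates a proper subfield of degree 2.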

The field $R(\vartheta)$ is called an algebraic number field; we say that $R(\vartheta)$
is obtained by adjoining $\vartheta$ to $R$, and call $R(\vartheta)$ a simple algebraic
extension of $R$, of degree $n$. This same field can be obtained by
adjoining various other numbers to $R$; for example, $R(2\vartheta) = R(\vartheta)$.
If an element $\alpha$ of $R(\vartheta)$ is such that $R(\alpha) = R(\vartheta)$, then $\alpha$ is called a
primitive element of $R(\vartheta)$. It is clear that the degrees of any two
primitive elements are the same, and both are equal to the degree
of the field.

There is, of course, no reason why the process of adjunction cannot
be repeated; one can start from $R(\vartheta)$ and adjoin an algebraic number
$\eta$ to it by taking all rational functions of $\eta$ with coefficients which are
elements of $R(\vartheta)$. This new field is denoted by $R(\vartheta)(\eta)$, or more
simply by $R(\vartheta, \eta)$.

Theorem 2-6. If $\vartheta$ and $\eta$ are algebraic numbers, the adjunction of
$\eta$ to $R(\vartheta)$ gives the same field $R(\vartheta, \eta)$ as the adjunction of $\vartheta$ to $R(\eta)$.
There exists an algebraic number $\zeta$ such that $R(\vartheta, \eta)$ is identical with
$R(\zeta)$.

Proof: The first part is clear, since both $R(\vartheta, \eta)$ and $R(\eta, \vartheta)$ are
identical with the field consisting of all numbers of the form

$$\frac{q_1(\vartheta, \eta)}{q_2(\vartheta, \eta)}, \qquad q_2(\vartheta, \eta) \neq 0,$$

where $q_1(x, y)$ and $q_2(x, y)$ are polynomials in two variables with
rational coefficients.

If $\eta$ is an element of $R(\vartheta)$, then $R(\vartheta, \eta) = R(\vartheta)$, since a rational
function of a rational function is again a rational function. Assume
then that $\vartheta$ and $\eta$ do not lie in the fields $R(\eta)$ and $R(\vartheta)$ respectively.
Let their defining polynomials be $p_1(x)$ and $p_2(x)$, and let their
conjugates be $\vartheta_1, \ldots, \vartheta_n$ and $\eta_1, \ldots, \eta_m$, respectively. Let $a$ and $b$
be rational numbers, and let $\zeta = \zeta_1, \ldots, \zeta_{nm}$ be all expressions of the
form $a\vartheta_j + b\eta_k$. Since the conjugates of $\vartheta$ are distinct, as are the
conjugates of $\eta$, there is only a finite set of ratios $a/b$ for which some
two of the $\zeta$'s are equal, and we choose $a$ and $b$ so that $a/b$ is not in
this set. Furthermore, we order the $\zeta$'s so that $\zeta = a\vartheta + b\eta$.
Now put

$$f(x) = (x - \zeta_1)(x - \zeta_2) \cdots (x - \zeta_{nm}).$$

This polynomial has no multiple zeros, and its coefficients, being
symmetric in the $\vartheta$'s and $\eta$'s, are rational. We show that $R(\vartheta, \eta) = R(\zeta)$.
It is clear that every element of $R(\zeta)$ is in $R(\vartheta, \eta)$. Suppose
on the other hand that $\rho$ is in $R(\vartheta, \eta)$, and that

$$\rho = \frac{q_1(\vartheta, \eta)}{q_2(\vartheta, \eta)}, \qquad q_2(\vartheta, \eta) \neq 0.$$

Then we can define the numbers $\rho = \rho_1, \ldots, \rho_{nm}$ by the equation

$$\rho_i = \frac{q_1(\vartheta_j, \eta_k)}{q_2(\vartheta_j, \eta_k)},$$

where the same subscripts appear on $\vartheta$ and $\eta$ in the definition of $\rho_i$
as in the definition of $\zeta_i$, for $i = 1, 2, \ldots, nm$. Now put

$$F(x) = f(x)\left[\frac{\rho}{x - \zeta} + \frac{\rho_2}{x - \zeta_2} + \cdots + \frac{\rho_{nm}}{x - \zeta_{nm}}\right];$$

by the Symmetric Function Theorem, the coefficients of $F(x)$ are
rational. If $i > 1$, the polynomial

$$f(x)\frac{\rho_i}{x - \zeta_i} = \rho_i (x - \zeta) \cdots (x - \zeta_{i-1})(x - \zeta_{i+1}) \cdots (x - \zeta_{nm})$$

vanishes for $x = \zeta$, and from the representation

$$f(x)\frac{\rho}{x - \zeta} = \rho (x - \zeta_2) \cdots (x - \zeta_{nm})$$

we have

$$F(\zeta) = \rho(\zeta - \zeta_2) \cdots (\zeta - \zeta_{nm}).$$

Since

$$f'(\zeta) = (\zeta - \zeta_2) \cdots (\zeta - \zeta_{nm}) \neq 0,$$

this gives

$$\rho = \frac{F(\zeta)}{f'(\zeta)},$$

and $\rho$ is in $R(\zeta)$.
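A small numerical check may make the theorem concrete. The example below is an assumption chosen for illustration: the two adjoined numbers are $\sqrt{2}$ and $i$, with $a = b = 1$, so that $\zeta = \sqrt{2} + i$. One finds by hand that $\zeta^3 = -\sqrt{2} + 5i$, which gives $i = (\zeta^3 + \zeta)/6$ and $\sqrt{2} = (5\zeta - \zeta^3)/6$; the code merely confirms these identities in floating point.

```python
import math

zeta = complex(math.sqrt(2), 1.0)              # zeta = sqrt(2) + i  (a = b = 1)

i_from_zeta = (zeta ** 3 + zeta) / 6           # should reproduce i
sqrt2_from_zeta = (5 * zeta - zeta ** 3) / 6   # should reproduce sqrt(2)

# f(x) = x^4 - 2x^2 + 9 has the four distinct zeros (+-sqrt(2)) + (+-i).
f_at_zeta = zeta ** 4 - 2 * zeta ** 2 + 9
```

Both generators are thus polynomials in $\zeta$ with rational coefficients, so the field obtained by adjoining $\sqrt{2}$ and $i$ coincides with $R(\zeta)$.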


PROBLEMS 

1. Prove Eisenstein's irreducibility criterion: a polynomial
$f(x) = a_0 + a_1 x + \cdots + a_n x^n$ with integral coefficients cannot be
written as a product of two or more polynomials with integral coefficients
and positive degrees, if there is a prime $p$ such that

$$p \nmid a_n, \quad p \mid a_i \ \text{if}\ i < n, \quad \text{and} \quad p^2 \nmid a_0.$$

[Hint: Suppose that there is such a $p$, but that $f(x) = g(x)h(x)$, where
$g(x) = b_0 + b_1 x + \cdots + b_r x^r$, $h(x) = c_0 + c_1 x + \cdots + c_s x^s$. It follows
that $p$ divides exactly one of $b_0$ and $c_0$, say $b_0$. Let $b_i$ be the first
coefficient in $g(x)$ not divisible by $p$, and deduce a contradiction from the
expression for $a_i$ in terms of the $b$'s and $c$'s.] As we shall see later
(Theorem 2-21), irreducibility over the set of polynomials with integral
coefficients implies irreducibility over $R[x]$. Use this fact in Problem 2.

2. Show that the following polynomials are irreducible over $R[x]$.

(a) $x^n - p$, $p$ a prime.

(b) $x^{p-1} + x^{p-2} + \cdots + x + 1$. [Hint: Replace $x$ by $x + 1$.]

(c) x’ + 3x* + 4. 

3. Show that $R(\sqrt{2}, \sqrt{3})$ is identical with $R(\sqrt{2} + \sqrt{3})$, and find a
rational function $r(x)$ with rational coefficients such that $r(\sqrt{2} + \sqrt{3}) = \sqrt{2}$.



2-3 Algebraic integers. If the defining (monic) polynomial of an
algebraic number $\vartheta$ has integral coefficients, $\vartheta$ is said to be an algebraic
integer. This is a direct extension of the notion of ordinary or rational
integers, which are the zeros of monic linear polynomials with integral
coefficients. Hereafter we shall designate by $Z$ the set of all rational
integers.


Theorem 2-7. The sum, difference, and product of two algebraic
integers are again algebraic integers.

The proof follows the lines of the proof of Theorem 2-3. 

Theorem 2-8. If a is a zero of a monic polynomial with coefficients 
in Z, then a is an algebraic integer. 

Proof: Suppose that $f(x) = x^m + a_1 x^{m-1} + \cdots + a_m$ is the polynomial,
and that $p(x) = x^n + r_1 x^{n-1} + \cdots + r_n$ is the defining polynomial
of $\alpha$. Let $b_0$ be the lcm of the denominators of the reduced fractions
$r_1, \ldots, r_n$, so that $b_0 p(x) = q(x) = b_0 x^n + b_1 x^{n-1} + \cdots + b_n$ has
relatively prime rational integral coefficients. Then $q(x)$ divides $f(x)$,
the coefficients in the quotient polynomial being rational, and we can
write

$$\frac{f(x)}{q(x)} = \frac{c\,g(x)}{c'},$$

where $c$ and $c'$ are so chosen that $g(x)$ has relatively prime coefficients
in $Z$. Thus $c'f(x) = c\,g(x)q(x)$, and the coefficients of the product
$g(x)q(x)$ are relatively prime.* Since this is also true of the coefficients
of $f(x)$ (for $f(x)$ is monic), we conclude that $c = c'$. Comparing the
leading coefficients in the equation $f(x) = g(x)q(x)$, we see that $b_0 \mid 1$,
and hence $b_0 = \pm 1$, which was to be proved.

*The reader may prove this simple fact himself, or refer to the remark
preceding Theorem 3-14, Volume I.

Theorem 2-9. If $\alpha$ is a root of an equation

$$f(x) = x^n + \beta_1 x^{n-1} + \cdots + \beta_n = 0,$$

in which $\beta_1, \ldots, \beta_n$ are algebraic integers, then $\alpha$ is an algebraic
integer.

Proof: By Theorem 2-6, the numbers $\beta_1, \ldots, \beta_n$ all lie in a single
field $R(\vartheta)$, of degree $m$, say. We can use the sets of field conjugates
to form polynomials

$$f_2(x) = x^n + \beta_1'' x^{n-1} + \cdots + \beta_n'',$$

$$\cdots$$

$$f_m(x) = x^n + \beta_1^{(m)} x^{n-1} + \cdots + \beta_n^{(m)}.$$

The product $f(x)f_2(x) \cdots f_m(x)$ has rational integral coefficients and
is monic; by Theorem 2-8, $\alpha$ is an algebraic integer.

The set of integers in a fixed algebraic number field $R(\vartheta)$ is also
closed under addition, subtraction, and multiplication. We shall
designate this set by $R[\vartheta]$ and call it the integral domain of the field.
In particular, $R[1] = Z$ is the set of rational integers.

Theorem 2-10. If $\vartheta$ is an algebraic number, there exists some
rational integer $a \neq 0$ such that $a\vartheta$ is an algebraic integer. If $\vartheta$
satisfies an equation $\beta_0 x^n + \beta_1 x^{n-1} + \cdots + \beta_n = 0$, in which
$\beta_0, \ldots, \beta_n$ are algebraic integers, then $\beta_0 \vartheta$ is an algebraic integer.

Proof: Let the defining equation of $\vartheta$ be

$$p(x) = x^n + r_1 x^{n-1} + \cdots + r_n = 0,$$

and let the lcm of the denominators of the fractions $r_1, \ldots, r_n$ be $a$.
Then the polynomial

$$a^n p\!\left(\frac{x}{a}\right) = x^n + a r_1 x^{n-1} + \cdots + a^n r_n$$

has integral coefficients and is monic and irreducible; its zeros $a\vartheta$,
$a\vartheta_2, \ldots, a\vartheta_n$ are therefore integers. The proof of the second part,
using Theorem 2-9, is similar.
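The scaling step in this proof is purely mechanical and can be sketched as follows. The function name `clear_denominators` and the two sample polynomials are assumptions made for illustration; only the arithmetic of the substitution is being shown, not any test of irreducibility.

```python
from fractions import Fraction
from math import gcd

def clear_denominators(r):
    """Given r = [r_1, ..., r_n] for p(x) = x^n + r_1 x^(n-1) + ... + r_n,
    return (a, [a*r_1, a^2*r_2, ..., a^n*r_n]), a = lcm of the denominators."""
    r = [Fraction(c) for c in r]
    a = 1
    for c in r:
        a = a * c.denominator // gcd(a, c.denominator)   # running lcm
    return a, [int(a ** (i + 1) * c) for i, c in enumerate(r)]

# p(x) = x^2 - 1/2 has the zero 1/sqrt(2); scaling by a = 2 gives x^2 - 2.
a, scaled = clear_denominators([0, Fraction(-1, 2)])
```

Here the multiple $2 \cdot (1/\sqrt{2}) = \sqrt{2}$ is a zero of the monic integral polynomial $x^2 - 2$.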

Since $R(\vartheta)$ and $R(a\vartheta)$ are identical for $a \neq 0$ in $Z$, any algebraic
number field can be considered as the result of adjoining an algebraic
integer to $R$.

If $\vartheta$ is an integer, so are its conjugates $\vartheta_2, \ldots, \vartheta_n$. The same is
therefore true of its field conjugates.

If $\alpha$ is any element of the field $R(\vartheta)$ of degree $n$, the product
$\alpha\alpha'' \cdots \alpha^{(n)}$ of all the field conjugates of $\alpha$ is called the norm of $\alpha$,
and denoted by $N\alpha$ (a more complete notation would be $N_{R(\vartheta)}\alpha$).

Theorem 2-11. The norm of an algebraic integer is a rational 
integer. 



Proof: If $\alpha$ has the defining equation

$$x^m + s_1 x^{m-1} + \cdots + s_m = 0,$$

then the norm of $\alpha$ (in any given $R(\vartheta)$ containing $\alpha$) is a power of
$\pm s_m$, by Theorem 2-5.

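Theorem 2-12. If $\alpha$ and $\beta$ are elements of $R(\vartheta)$, then

$$N(\alpha\beta) = N\alpha \cdot N\beta.$$

Proof: Put

$$\alpha = a_0 + a_1 \vartheta + \cdots + a_{n-1}\vartheta^{n-1}, \qquad \beta = b_0 + b_1 \vartheta + \cdots + b_{n-1}\vartheta^{n-1}. \qquad (3)$$

Then in the product $\alpha\beta$, powers of $\vartheta$ higher than the $(n - 1)$th can
be reduced using the equation

$$\vartheta^n = -(r_1 \vartheta^{n-1} + \cdots + r_n) \qquad (4)$$

derived from the defining equation of $\vartheta$. Also $\alpha^{(k)}$ and $\beta^{(k)}$ can be
obtained from (3) by replacing $\vartheta$ by $\vartheta_k$, and in the product $\alpha^{(k)}\beta^{(k)}$
higher powers of $\vartheta_k$ can be reduced by using (4) with $\vartheta$ replaced by $\vartheta_k$.
Hence the field conjugates $(\alpha\beta)', (\alpha\beta)'', \ldots, (\alpha\beta)^{(n)}$ of $\alpha\beta$ are simply
$\alpha\beta, \alpha''\beta'', \ldots, \alpha^{(n)}\beta^{(n)}$. Thus

$$N\alpha\beta = (\alpha\beta)'(\alpha\beta)'' \cdots (\alpha\beta)^{(n)} = \alpha'\alpha'' \cdots \alpha^{(n)}\,\beta'\beta'' \cdots \beta^{(n)} = N\alpha \cdot N\beta.$$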
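The multiplicativity of the norm is easy to check in the simplest case of a quadratic field. In the sketch below an element $a + b\sqrt{d}$ is stored as the pair `(a, b)`, its only other field conjugate is $a - b\sqrt{d}$, and so its norm is $a^2 - db^2$; the function names and the sample values are illustrative assumptions.

```python
def mul(u, v, d):
    """(a + b sqrt(d)) * (c + e sqrt(d)), using sqrt(d)^2 = d."""
    (a, b), (c, e) = u, v
    return (a * c + d * b * e, a * e + b * c)

def norm(u, d):
    """Product of a + b sqrt(d) and its conjugate a - b sqrt(d)."""
    a, b = u
    return a * a - d * b * b

d = 5
alpha, beta = (1, 2), (3, -1)
product = mul(alpha, beta, d)      # (1 + 2 sqrt 5)(3 - sqrt 5) = -7 + 5 sqrt 5
```

With these values the norms of the factors are $-19$ and $4$, and the norm of the product is $49 - 125 = -76$, their product.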


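Now let $\alpha, \beta, \ldots, \nu$ be $n$ elements of $R(\vartheta)$, with field conjugates
$\alpha^{(k)}, \beta^{(k)}, \ldots, \nu^{(k)}$, where $k = 1, 2, \ldots, n$. The number

$$\Delta(\alpha, \beta, \ldots, \nu) = \begin{vmatrix} \alpha & \alpha'' & \cdots & \alpha^{(n)} \\ \beta & \beta'' & \cdots & \beta^{(n)} \\ \vdots & \vdots & & \vdots \\ \nu & \nu'' & \cdots & \nu^{(n)} \end{vmatrix}^2$$

is called the discriminant of $\alpha, \beta, \ldots, \nu$. Its value is independent of
the order of rows or of columns.

Theorem 2-13. If $\alpha, \beta, \ldots, \nu$ are in $R[\vartheta]$, then $\Delta(\alpha, \beta, \ldots, \nu)$
is a rational integer.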



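Proof: If we take the row-by-column product, we have

$$\Delta(\alpha, \ldots, \nu) = \begin{vmatrix} \alpha & \cdots & \alpha^{(n)} \\ \vdots & & \vdots \\ \nu & \cdots & \nu^{(n)} \end{vmatrix} \cdot \begin{vmatrix} \alpha & \cdots & \nu \\ \vdots & & \vdots \\ \alpha^{(n)} & \cdots & \nu^{(n)} \end{vmatrix}$$

$$= \begin{vmatrix} \alpha^2 + \cdots + (\alpha^{(n)})^2 & \cdots & \alpha\nu + \cdots + \alpha^{(n)}\nu^{(n)} \\ \vdots & & \vdots \\ \alpha\nu + \cdots + \alpha^{(n)}\nu^{(n)} & \cdots & \nu^2 + \cdots + (\nu^{(n)})^2 \end{vmatrix}.$$

Just as in the proof of Theorem 2-12,

$$\alpha\beta + \alpha''\beta'' + \cdots + \alpha^{(n)}\beta^{(n)} = \alpha\beta + (\alpha\beta)'' + \cdots + (\alpha\beta)^{(n)},$$

and the sum of the field conjugates of an integer is itself a rational
integer, by analogy with the proof of Theorem 2-11. Hence, the
number $\Delta(\alpha, \beta, \ldots, \nu)$ can be written as a determinant with rational
integral entries, and so is a rational integer.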
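For the powers $1, \vartheta, \ldots, \vartheta^{n-1}$ of a single generator, the entries of the product determinant are the power sums $s_k = \vartheta^k + \vartheta_2^k + \cdots + \vartheta_n^k$, and these can be computed from the coefficients of the defining polynomial alone. The sketch below uses Newton's identities for that purpose (a technique brought in here for the computation, not one used in the text); the function names are hypothetical.

```python
from fractions import Fraction

def power_sums(c, count):
    """s_0..s_(count-1) for the zeros of x^n + c[0]x^(n-1) + ... + c[n-1],
    by Newton's identities."""
    n = len(c)
    s = [Fraction(n)]
    for k in range(1, count):
        total = Fraction(0)
        for j in range(1, min(k, n) + 1):
            if j == k:
                total += k * Fraction(c[j - 1])        # the term k * c_k
            else:
                total += Fraction(c[j - 1]) * s[k - j]
        s.append(-total)
    return s

def det(M):
    """Determinant by exact Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n, result = len(M), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            result = -result
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n):
                M[r][k] -= factor * M[col][k]
    return result

def power_basis_discriminant(c):
    """Discriminant of 1, theta, ..., theta^(n-1) as det of the matrix (s_(i+j))."""
    n = len(c)
    s = power_sums(c, 2 * n - 1)
    return det([[s[i + j] for j in range(n)] for i in range(n)])

disc_cubic = power_basis_discriminant([0, -1, 1])      # x^3 - x + 1
```

For $x^3 - x + 1$ the matrix of power sums has rows $(3, 0, 2)$, $(0, 2, -3)$, $(2, -3, 2)$ and determinant $-23$, a rational integer as the theorem requires.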

The numbers $1, \vartheta, \ldots, \vartheta^{n-1}$ are said to form a basis of $R(\vartheta)$, in the
sense that every element of $R(\vartheta)$ can be expressed in a unique way as
a linear combination of these numbers, with coefficients in $R$ (cf.
Theorem 2-4). We now examine the possibility of finding a basis for
$R[\vartheta]$; that is, a set of elements of $R[\vartheta]$ such that every element of $R[\vartheta]$
can be expressed in a unique way as a linear combination of them,
the coefficients in this case being in $Z$. To emphasize the distinction
between these two kinds of bases, the second is sometimes called an
integral basis. Every integral basis is a basis of $R(\vartheta)$, as is
immediately seen from Theorem 2-10, but the converse is false.

If $\omega_1, \ldots, \omega_n$ is to be an integral basis, then for any $\rho$ in $R[\vartheta]$ the
equation

$$\rho = x_1 \omega_1 + \cdots + x_n \omega_n,$$

and therefore also the equations

$$\rho^{(k)} = x_1 \omega_1^{(k)} + \cdots + x_n \omega_n^{(k)}, \qquad k = 2, \ldots, n,$$

must hold for some rational integers $x_1, \ldots, x_n$. If $\Delta(\omega_1, \ldots, \omega_n) \neq 0$,
this system of equations can be solved, giving each $x_i$ as the quotient
of determinants, the determinant in each denominator being a
square root of $\Delta(\omega_1, \ldots, \omega_n)$. It seems plausible that the smaller
$|\Delta(\omega_1, \ldots, \omega_n)|$, the better the chance of obtaining rational integral
$x_i$. Hence, if an integral basis always exists, the next theorem ought
to be true.

Theorem 2-14. If $\omega_1, \omega_2, \ldots, \omega_n$ are any $n$ integers of $R(\vartheta)$ for
which $|\Delta(\omega_1, \omega_2, \ldots, \omega_n)|$ has its smallest possible value different
from zero, then $\omega_1, \ldots, \omega_n$ form a basis of $R[\vartheta]$.


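Proof: Write

$$\omega_i = \sum_{j=0}^{n-1} a_{ij}\vartheta^j, \qquad i = 1, 2, \ldots, n, \qquad (5)$$

where the $a_{ij}$ are in $R$. Then

$$\Delta(\omega_1, \ldots, \omega_n) = \begin{vmatrix} \sum_{j=0}^{n-1} a_{1j}\vartheta^j & \cdots & \sum_{j=0}^{n-1} a_{1j}\vartheta_n^j \\ \vdots & & \vdots \\ \sum_{j=0}^{n-1} a_{nj}\vartheta^j & \cdots & \sum_{j=0}^{n-1} a_{nj}\vartheta_n^j \end{vmatrix}^2,$$

and this can be factored:

$$\Delta(\omega_1, \ldots, \omega_n) = (\det |a_{ij}|)^2 \Delta(1, \vartheta, \ldots, \vartheta^{n-1}), \qquad (6)$$

where $\vartheta, \vartheta_2, \ldots, \vartheta_n$ are the conjugates of $\vartheta$. Since $\Delta(\omega_1, \ldots, \omega_n) \neq 0$,
also $\det |a_{ij}| \neq 0$, and the system of equations (5) can be solved for
the numbers $1, \vartheta, \ldots, \vartheta^{n-1}$, giving linear expressions in $\omega_1, \ldots, \omega_n$.
Thus every number $\rho$ of $R[\vartheta]$ can be written in the form

$$\rho = b_1 \omega_1 + \cdots + b_n \omega_n, \qquad (7)$$

where $b_1, \ldots, b_n$ are rational. We must show that they are rational
integers.

If this is not the case for the $\rho$ of (7), then some $b_i$ has a nonzero
fractional part:

$$b_i = [b_i] + c,$$

where $0 < c < 1$, and the symbol $[b]$ means the largest integer not
exceeding $b$. Put

$$\rho_1 = \rho - [b_i]\omega_i = b_1 \omega_1 + \cdots + c\,\omega_i + \cdots + b_n \omega_n.$$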



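In just the same way that (6) was deduced from (5), we can deduce
from the system of equations

$$\omega_1 = \omega_1, \quad \omega_2 = \omega_2, \quad \ldots, \quad \omega_{i-1} = \omega_{i-1},$$

$$\rho_1 = b_1 \omega_1 + b_2 \omega_2 + \cdots + c\,\omega_i + \cdots + b_n \omega_n,$$

$$\omega_{i+1} = \omega_{i+1}, \quad \ldots, \quad \omega_n = \omega_n$$

the relation

$$\Delta(\omega_1, \ldots, \rho_1, \ldots, \omega_n) = \begin{vmatrix} 1 & 0 & \cdots & & \cdots & 0 \\ 0 & 1 & \cdots & & \cdots & 0 \\ \vdots & & & & & \vdots \\ b_1 & b_2 & \cdots & c & \cdots & b_n \\ \vdots & & & & & \vdots \\ 0 & 0 & \cdots & & \cdots & 1 \end{vmatrix}^2 \Delta(\omega_1, \ldots, \omega_n)$$

$$= c^2 \Delta(\omega_1, \ldots, \omega_n).$$

But this implies that the discriminant of the system $\omega_1, \ldots, \rho_1, \ldots, \omega_n$
is numerically smaller than that of $\omega_1, \ldots, \omega_n$, and is not zero, which
is contrary to the hypothesis that $|\Delta(\omega_1, \ldots, \omega_n)|$ is minimal.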

Any two integral bases of a single field have the same discriminant,
since each is the product of the other and the square of a determinant
with integral entries, as in (6). The common value is called the
discriminant of the field; we shall designate it by $\Delta$ hereafter.

PROBLEMS 

1. Let $\vartheta$, $\vartheta'$, and $\vartheta''$ be the roots of

(a) $x^3 + 2x + 6 = 0$,

(b) $x^3 - x^2 - x - 2 = 0$.

Compute the numbers $N_{R(\vartheta)}(3\vartheta - 2)$.

Answer: (a) $-206$; (b) 4, 19.




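2. (a) Let $f(x) = a_0 x^n + \cdots + a_n$ be irreducible over $R$, and let
$\vartheta, \vartheta'', \ldots, \vartheta^{(n)}$ be the zeros of $f$. Show that in $R(\vartheta)$,

$$a_0^n \Delta(1, \vartheta, \ldots, \vartheta^{n-1}) = (-1)^{n(n-1)/2} \prod_{i=1}^{n} f'(\vartheta^{(i)}).$$

[This depends on the well-known factorization

$$\begin{vmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ \vdots & \vdots & & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{vmatrix} = \prod_{1 \le j < i \le n} (x_i - x_j)$$

of a Vandermonde determinant.]

(b) If in particular $f(x) = x^3 + px + q$, show that
$\Delta(1, \vartheta, \vartheta^2) = -27q^2 - 4p^3$.

3. Show that if $\alpha_1, \ldots, \alpha_n$ are elements of $R[\vartheta]$ such that
$\Delta(\alpha_1, \ldots, \alpha_n)$ is square-free, then $\alpha_1, \ldots, \alpha_n$ form a basis for $R[\vartheta]$.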


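2-4 Units and primes in $R[\vartheta]$. If $\alpha$ and $\beta$ are in an integral
domain $R[\vartheta]$, we say that $\beta$ divides $\alpha$, and write $\beta \mid \alpha$, if there is another
element $\gamma$ of $R[\vartheta]$ such that $\alpha = \beta\gamma$. An integer $\epsilon$ such that $\epsilon \mid 1$ is
called a unit of $R[\vartheta]$. We say that $\alpha$ and $\beta$ are associates if $\alpha = \epsilon\beta$,
where $\epsilon$ is a unit.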

Theorem 2-15. An element of $R[\vartheta]$ is a unit if and only if its norm
(as an element of $R(\vartheta)$) is $\pm 1$.

Proof: If $\epsilon$ is a unit and

$$x^n + e_1 x^{n-1} + \cdots + e_n = 0$$

is its defining equation, then the defining equation of $1/\epsilon$ is

$$x^n + \frac{e_{n-1}}{e_n} x^{n-1} + \cdots + \frac{1}{e_n} = 0.$$

Since $1/\epsilon$ is an integer, $e_n = \pm 1$, and $N(1/\epsilon)$ is a power of the
constant term in the defining equation of $1/\epsilon$. (Alternatively, this result
could be deduced from the multiplicativity of the norm. For if $\epsilon$ is a
unit, there exists an integer $\epsilon_1$ such that $\epsilon\epsilon_1 = 1$. Hence
$1 = N1 = N\epsilon\epsilon_1 = N\epsilon \cdot N\epsilon_1$, and since the norm of an integer is a
rational integer, $N\epsilon = \pm 1$.)

Conversely, if the constant term in the defining equation of an
element of $R[\vartheta]$ is $\pm 1$, then the reciprocal of the element is also an
element of $R[\vartheta]$, and the element is a unit.



The units of an integral domain form a multiplicative group, since
the product of units is a unit, 1 is a unit, and each unit $\epsilon$ has an inverse
$\epsilon_1$ such that $\epsilon\epsilon_1 = 1$.

In the domain of rational integers, the only units are $\pm 1$; in the
Gaussian domain $R[i]$ the units are $\pm 1$, $\pm i$. All these units are roots
of unity, but in some domains there are units which are not roots of
unity, and in fact do not have absolute value 1. This was pointed out
in Chapter 8 of Volume I, but we can now go into details.

Let $d$ be a square-free rational integer, and consider the field
$R(\sqrt{d})$. As a basis for the field we can take $1, \sqrt{d}$, so that every
element of $R(\sqrt{d})$ can be uniquely expressed in the form $a + b\sqrt{d}$,
where $a$ and $b$ are in $R$. If $b = 0$, then $a + b\sqrt{d}$ is an integer if and
only if $a$ is in $Z$. If $b \neq 0$, the defining equation of $a + b\sqrt{d}$ is

$$(x - a - b\sqrt{d})(x - a + b\sqrt{d}) = x^2 - 2ax + a^2 - db^2 = 0,$$

so that if $a + b\sqrt{d}$ is in $R[\sqrt{d}]$, both $2a$ and $a^2 - db^2$ must be rational
integers. Hence $(2a)^2 - 4(a^2 - db^2) = 4db^2$ is also in $Z$; since $d$ is
square-free, it follows that $2b$ is in $Z$.

Suppose that $a = k + \tfrac{1}{2}$, with $k$ in $Z$. Then

$$0 \equiv 4a^2 - 4db^2 = 4k^2 + 4k + 1 - 4db^2 \equiv 1 - 4db^2 \pmod{4},$$

and it follows that $2b \equiv 1 \pmod{2}$, and $d \equiv 1 \pmod{4}$. Conversely,
if $a$ and $b$ are halves of odd integers and $d \equiv 1 \pmod{4}$, the defining
equation of $a + b\sqrt{d}$ has coefficients in $Z$. Hence 1 and $(1 + \sqrt{d})/2$
form a basis of $R[\sqrt{d}]$, if $d \equiv 1 \pmod{4}$.

If $d \equiv 2$ or $3 \pmod{4}$, then $a$ must be a rational integer. If $b$ were
of the form $k + \tfrac{1}{2}$ with $k$ in $Z$, we should have

$$0 \equiv 4a^2 - 4db^2 = 4a^2 - (4k^2 + 4k + 1)d \equiv -d \pmod{4},$$

and $d$ would not be square-free. Hence in this case both $a$ and $b$
must be in $Z$, and $1, \sqrt{d}$ form a basis of $R[\sqrt{d}]$.

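Theorem 2-16. Let $d$ be a square-free rational integer. Then if
$d \equiv 1 \pmod 4$, the elements of $R[\sqrt{d}]$ are either of the form

    $$a + b\sqrt{d}, \qquad a \text{ and } b \text{ in } Z, \qquad (8)$$

or

    $$\frac{a + b\sqrt{d}}{2}, \qquad a \text{ and } b \text{ in } Z, \quad a \equiv b \equiv 1 \pmod 2,$$

and the discriminant of $R(\sqrt{d})$ is

    $$\Delta = \begin{vmatrix} 1 & \tfrac{1}{2}(1 + \sqrt{d}) \\ 1 & \tfrac{1}{2}(1 - \sqrt{d}) \end{vmatrix}^2 = (-\sqrt{d})^2 = d.$$

If $d \equiv 2$ or $3 \pmod 4$, all the elements of $R[\sqrt{d}]$ are of the form (8),
and the discriminant of $R(\sqrt{d})$ is

    $$\Delta = \begin{vmatrix} 1 & \sqrt{d} \\ 1 & -\sqrt{d} \end{vmatrix}^2 = (-2\sqrt{d})^2 = 4d.$$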
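
As an informal check of Theorem 2-16, the following Python sketch tests
whether $(a + b\sqrt{d})/2$ is an integer by looking at the constant term
$(a^2 - db^2)/4$ of its defining equation, and computes the discriminant.
The function names are illustrative assumptions and not notation from
the text; $d$ is assumed to be square-free and different from 1.

```python
def is_integer_half(a, b, d):
    """Return True if (a + b*sqrt(d))/2 is an algebraic integer.

    The number is a zero of x^2 - a*x + (a^2 - d*b^2)/4, so it is an
    integer exactly when (a^2 - d*b^2)/4 is a rational integer.
    (For b = 0 this reduces to the condition that a be even.)
    """
    return (a * a - d * b * b) % 4 == 0


def discriminant(d):
    """Discriminant of R(sqrt(d)) for square-free d, as in Theorem 2-16."""
    return d if d % 4 == 1 else 4 * d


# Illustrative usage: (1 + sqrt(5))/2 is an integer, (1 + sqrt(-5))/2 is not.
print(is_integer_half(1, 1, 5), is_integer_half(1, 1, -5))
print(discriminant(5), discriminant(-5), discriminant(-1))
```

Python's `%` operator returns a non-negative remainder for a positive
modulus, so the test `d % 4 == 1` also behaves as intended for negative $d$.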


The units of $R[\sqrt{d}]$ are the integers $\epsilon$ for which $N\epsilon = \pm 1$. If
$d \equiv 2$ or $3 \pmod 4$, then $\epsilon$ is of the form (8), so that the units are
given by the solutions of the Pell equations

    $$x^2 - dy^2 = \pm 1. \qquad (9)$$

If $d \equiv 1 \pmod 4$, the units are the integers of the form $(x + y\sqrt{d})/2$,
where $x + y\sqrt{d}$ is a solution of one of the Pell equations

    $$x^2 - dy^2 = \pm 4. \qquad (10)$$

If $d < 0$, these Pell equations have only trivial solutions; (9) has
solutions $\pm 1, 0$ in all cases, and $0, \pm 1$ if $d = -1$, while (10) has the
solution $\pm 2, 0$ always, and $\pm 1, \pm 1$ if $d = -3$. If $d > 0$, equations
(9) and (10) have infinitely many solutions.*
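
The statements about equations (9) and (10) can be explored numerically.
The sketch below is a brute-force search, not the continued-fraction
method referred to in the footnote; the function names and the search
bounds are assumptions made for illustration.

```python
from math import isqrt


def pell_solutions(d, rhs, bound):
    """All integer pairs (x, y) with |x|, |y| <= bound and x^2 - d*y^2 == rhs."""
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if x * x - d * y * y == rhs]


def smallest_unit(d, max_y=10_000):
    """Search for the solution of x^2 - d*y^2 = +1 or -1 with least y >= 1.

    Returns (x, y, sign) with x >= 1, or None if nothing is found up to
    max_y.  Intended for square-free d > 1 with d = 2 or 3 (mod 4).
    """
    for y in range(1, max_y + 1):
        for sign in (1, -1):
            t = d * y * y + sign
            x = isqrt(t)
            if x * x == t:
                return (x, y, sign)
    return None


print(pell_solutions(-1, 1, 3))   # solutions of (9) for the Gaussian domain
print(pell_solutions(-3, 4, 3))   # solutions of (10) for d = -3
print(smallest_unit(2), smallest_unit(7))
```

For $d = 2$ the search finds $x = y = 1$ with $x^2 - 2y^2 = -1$, that is, the
unit $1 + \sqrt{2}$, which is not a root of unity.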

Returning to the general domain $R[\vartheta]$, we say that an element $\pi$ is
prime if it is not a unit and has no factors other than its associates and
units.


Theorem 2-17. Every nonunit element of $R[\vartheta]$ can be written as a
finite product of primes.

Proof: If $\alpha$ in $R[\vartheta]$ is not a unit, $|N\alpha| > 1$. If $\alpha$ is prime, we have
the trivial representation $\alpha = \alpha$. If not, there is a factorization
$\alpha = \beta\gamma$ into nonunits, and $N\alpha = N\beta \cdot N\gamma$, where

    $$1 < |N\beta| < |N\alpha|, \qquad 1 < |N\gamma| < |N\alpha|.$$

If either $\beta$ or $\gamma$ is not prime, it may be factored. The process must
terminate, since the rational integer $N\alpha$ has only finitely many
divisors of absolute value greater than 1.

*This result is given in Chapter 8, Volume I. The solutions for given $d$
can be found explicitly with the aid of Theorem 9-6 of that volume.





To see that this factorization need not be unique, consider the two
representations

    $$21 = 3 \cdot 7 = (4 + \sqrt{-5})(4 - \sqrt{-5})$$

of 21 in $R[\sqrt{-5}]$. Since $-5 \not\equiv 1 \pmod 4$, the integers of this domain
are $a + b\sqrt{-5}$, with $a$ and $b$ in $Z$, and the units are $\pm 1$. It is clear
that no two of the numbers $3$, $7$, $4 + \sqrt{-5}$, $4 - \sqrt{-5}$ are associates,
and we can also show that all of them are primes in $R[\sqrt{-5}]$. Suppose
that

    $$(a_1 + b_1\sqrt{-5})(a_2 + b_2\sqrt{-5}) = 3.$$

Then

    $$N(a_1 + b_1\sqrt{-5})\,N(a_2 + b_2\sqrt{-5}) = N3 = 9,$$

so that if neither factor is a unit, it must be that

    $$N(a_1 + b_1\sqrt{-5}) = a_1^2 + 5b_1^2 = 3. \qquad (11)$$

This equation, however, has no solution in $Z$. By a similar argument,
7 has no proper divisors, since the equation

    $$a_1^2 + 5b_1^2 = 7 \qquad (12)$$

has no solution in $Z$. Finally, an assumed factorization of either
$4 \pm \sqrt{-5}$ leads to the equation

    $$N(a_1 + b_1\sqrt{-5}) \cdot N(a_2 + b_2\sqrt{-5}) = 21,$$

which in turn requires that either (11) or (12) hold. Hence $R[\sqrt{-5}]$
is not a unique factorization domain.
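
The norm computations in this example are easy to reproduce. The
following sketch represents $a + b\sqrt{-5}$ as the pair `(a, b)`; the helper
names are assumptions introduced for illustration only.

```python
from math import isqrt


def norm(a, b, d=-5):
    """Norm a^2 - d*b^2 of a + b*sqrt(d)."""
    return a * a - d * b * b


def multiply(u, v, d=-5):
    """Product of u = (a1, b1) and v = (a2, b2), each meaning a + b*sqrt(d)."""
    (a1, b1), (a2, b2) = u, v
    return (a1 * a2 + d * b1 * b2, a1 * b2 + a2 * b1)


def elements_of_norm(n, d=-5):
    """All (a, b) with a^2 - d*b^2 == n, for d < 0 and n >= 0."""
    amax, bmax = isqrt(n), isqrt(n // -d)
    return [(a, b)
            for a in range(-amax, amax + 1)
            for b in range(-bmax, bmax + 1)
            if norm(a, b, d) == n]


print(multiply((4, 1), (4, -1)))                  # second factorization of 21
print(elements_of_norm(3), elements_of_norm(7))   # equations (11) and (12)
```

Both searches come back empty, in agreement with the claim that (11)
and (12) have no solution in $Z$.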

A domain $R[\vartheta]$ is called a Euclidean domain if for any pair of integers
$\beta \neq 0$ and $\alpha$ of $R[\vartheta]$, there is an element $\gamma$ such that

    $$|N(\alpha - \beta\gamma)| < |N\beta|.$$

In this case, there is a Euclidean algorithm by means of which a
greatest common divisor can be defined, such that if $(\alpha, \beta) = \delta$,
there are integers $\gamma_1$ and $\gamma_2$ for which $\alpha\gamma_1 + \beta\gamma_2 = \delta$. It is
this last property which is essential for unique factorization, since
from it we get the result, equivalent to the Unique Factorization
Theorem, that if $\beta \mid \alpha\gamma$ and $(\beta, \alpha) = 1$ then $\beta \mid \gamma$. For if
$\gamma_1\alpha + \gamma_2\beta = 1$, then $\gamma_1\alpha\gamma + \gamma_2\beta\gamma = \gamma$; hence $\beta \mid \gamma$. There is no
such GCD in $R[\sqrt{-5}]$.





For example, $3$ and $4 + \sqrt{-5}$ must be considered as relatively
prime, since they are nonassociated primes, but if we had

    $$3(a + b\sqrt{-5}) + (4 + \sqrt{-5})(c + d\sqrt{-5}) = 1, \qquad a, b, c, d \text{ in } Z,$$

it would follow that

    $$3a + 4c - 5d = 1, \qquad 3b + c + 4d = 0.$$

Subtracting the second equation from the first, we would have

    $$3(a - b + c - 3d) = 1,$$

which is palpably false.

Every Euclidean domain, then, is a unique factorization domain,
although the converse is not true. The quadratic Euclidean domains
are completely known: $R[\sqrt{d}]$ is Euclidean if and only if $d$ has one of
the 21 values $-11, -7, -3, -2, -1, 2, 3, 5, 6, 7, 11, 13, 17, 19, 21,$
$29, 33, 37, 41, 57,$ or $73$.
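
For the Gaussian domain ($d = -1$ in the list above) the Euclidean
property can be made concrete. The sketch below chooses $\gamma$ by rounding
the exact quotient $\alpha/\beta$ to a nearest Gaussian integer, which gives
$N(\alpha - \beta\gamma) \le N\beta/2$. This is one possible illustration under that
assumption, with hypothetical function names, and pairs `(a, b)` stand
for $a + bi$.

```python
def gauss_mul(u, v):
    """Product of the Gaussian integers u = (a, b) and v = (c, d)."""
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c)


def gauss_norm(u):
    """Norm a^2 + b^2 of the Gaussian integer u = (a, b)."""
    return u[0] * u[0] + u[1] * u[1]


def gauss_divmod(alpha, beta):
    """Return (gamma, rest) with alpha = beta*gamma + rest, N(rest) < N(beta)."""
    n = gauss_norm(beta)
    if n == 0:
        raise ZeroDivisionError("beta must be nonzero")
    (a, b), (c, d) = alpha, beta
    # alpha / beta = alpha * conj(beta) / N(beta)
    re, im = a * c + b * d, b * c - a * d
    # Round each coordinate to a nearest integer: floor(value + 1/2).
    gamma = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))
    prod = gauss_mul(beta, gamma)
    return gamma, (a - prod[0], b - prod[1])


def gauss_gcd(alpha, beta):
    """A greatest common divisor, determined only up to a unit factor."""
    while beta != (0, 0):
        alpha, beta = beta, gauss_divmod(alpha, beta)[1]
    return alpha


print(gauss_gcd((5, 0), (1, 2)))
```

Since $5 = (1 + 2i)(1 - 2i)$, the call above returns an associate of
$1 + 2i$.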


PROBLEMS 

1. Show that $R[\rho]$, where $\rho = (-1 + i\sqrt{3})/2$ is a cube root of unity, is a
Euclidean domain. [Compare Theorem 7-6, Volume I.]

2. Find the GCD of $2 + \rho$ and $5 + 7\rho$ in $R[\rho]$.

3. Show that if $d$ is square-free, and if $\Delta$ is the discriminant of $R(\sqrt{d})$,
then the numbers $1$ and $(\Delta + \sqrt{\Delta})/2$ form a basis of $R[\sqrt{d}]$.

2-5 Ideals. One way of restoring unique factorization consists in
enlarging the set of possible divisors; we might for example try to
find entities $A$, $B$, $C$, and $D$ of $R[\sqrt{-5}]$ which are in some sense prime,
and such that

    $$3 = AB, \quad 7 = CD, \quad 4 + \sqrt{-5} = AC, \quad 4 - \sqrt{-5} = BD.$$

Then the two representations of 21 in $R[\sqrt{-5}]$ would no longer differ
essentially; instead we would have

    $$21 = (AB)(CD) = (AC)(BD) = ABCD.$$

To accomplish this without going outside the domain, we make a shift
of emphasis; rather than asking for the divisors of a given number
we look for all the numbers which have a given divisor. Here two
properties of the divisibility concept, in which the divisor is fixed,
come to mind:

(a) If $\gamma \mid \alpha$, then $\gamma \mid \alpha\lambda$ for every integer $\lambda$.

(b) If $\gamma \mid \alpha$ and $\gamma \mid \beta$, then $\gamma \mid \alpha \pm \beta$.

In other words, the set of multiples of $\gamma$ forms an additive group which
is closed under multiplication by elements of the domain (but not
necessarily in the set). If $\alpha \mid \beta$, then the set of multiples of $\alpha$ contains
the set of multiples of $\beta$. The GCD (if there is one) of $\alpha$ and $\beta$ has as
multiples the set of numbers of the form $\alpha' + \beta'$, where $\alpha'$ and $\beta'$ run
independently over the multiples of $\alpha$ and $\beta$ respectively, and this set
is again an additive group closed under multiplication by elements
of the domain.

Because of the repeated occurrence of this special kind of set, we
give the name ideal to any subset (containing at least one element
besides zero) of an integral domain $R[\vartheta]$ which forms a group under
addition and is closed under multiplication by elements of the domain.
Since there is no reason to suppose that every ideal of $R[\vartheta]$ consists of
all the multiples of a single element of $R[\vartheta]$, we shall designate a
general ideal by a capital letter. A principal ideal, consisting of all
multiples of a given element $\alpha$ of the domain, will be designated by $[\alpha]$.
(It will be clear from the context whether the brackets designate an
ideal or the greatest-integer function.) But instead of a single number
$\alpha$, we could begin with any finite set $\alpha_1, \ldots, \alpha_m$ of elements of $R[\vartheta]$
and form all expressions

    $$\lambda_1\alpha_1 + \lambda_2\alpha_2 + \cdots + \lambda_m\alpha_m,$$

where $\lambda_1, \ldots, \lambda_m$ run independently over $R[\vartheta]$; the set of all such
expressions again forms an ideal, which will be designated by
$[\alpha_1, \ldots, \alpha_m]$. (The numbers $\alpha_1, \ldots, \alpha_m$ are called generators of the
ideal $[\alpha_1, \ldots, \alpha_m]$.) This notation is similar to that for the GCD, if
such exists, except that instead of writing $(\alpha, \beta) = \delta$ we
write $[\alpha, \beta] = [\delta]$. (Two ideals are said to be equal if they consist of
the same numbers.) It will be shown later that $R[\vartheta]$ is a unique
factorization domain if and only if every ideal of $R[\vartheta]$ is a principal
ideal. This should not be surprising, since this latter condition
simply requires that any two elements of $R[\vartheta]$ should have a GCD in
$R[\vartheta]$ which can be expressed as a linear combination of the elements.

Theorem 2-18. If $R(\vartheta)$ is of degree $n$, and $A$ is an ideal of $R[\vartheta]$,
then there exist elements $\alpha_1, \ldots, \alpha_n$ of $A$ such that every element of
$A$ can be uniquely represented in the form

    $$k_1\alpha_1 + \cdots + k_n\alpha_n, \qquad k_1, \ldots, k_n \text{ in } Z.$$

Remark: Note that the $k$'s are rational integers, and not elements
of $R[\vartheta]$. The numbers $\alpha_1, \ldots, \alpha_n$ of the theorem are called a basis
of $A$; they may be taken as a set of generators of $A$, but may not be
the smallest such set.


Proof: If the polynomial defining an element $\alpha \neq 0$ of $A$ is $p(x)$,
then for some $h$, the zeros of $p^h(x)$ are the field conjugates of $\alpha$, so that

    $$p^h(x) = x^n + a_1x^{n-1} + \cdots \pm N\alpha,$$

and $N\alpha = \pm(\alpha^{n-1} + a_1\alpha^{n-2} + \cdots)\alpha$ is in $A$. Hence $A$ contains a
rational integer different from zero, and therefore a smallest positive
integer, say $a$. If $\rho_1, \ldots, \rho_n$ is an integral basis of $R[\vartheta]$, then $A$
contains $a\rho_i$ for each $i$. Let $a_{11}$ be the smallest positive rational integer
such that the number

    $$\alpha_1 = a_{11}\rho_1$$

is in $A$. Since $A$ contains $a\rho_1$ and $a\rho_2$, it contains numbers which
are linear combinations of $\rho_1$ and $\rho_2$ with coefficients in $Z$. Of these
there is one (not necessarily unique) for which the coefficient of $\rho_2$ is
positive and minimal. Let it be

    $$\alpha_2 = a_{21}\rho_1 + a_{22}\rho_2.$$

Similarly, for $\nu = 3, \ldots, n$, put

    $$\alpha_\nu = a_{\nu 1}\rho_1 + a_{\nu 2}\rho_2 + \cdots + a_{\nu\nu}\rho_\nu,$$

where $a_{\nu i}$ is in $Z$ for $1 \le i \le \nu$ and $a_{\nu\nu}$ is positive and minimal for $\alpha_\nu$
in $A$. It is asserted that $\alpha_1, \ldots, \alpha_n$ form a basis of $A$.

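Suppose that

    $$\alpha = c_1\rho_1 + \cdots + c_n\rho_n, \qquad c_1, \ldots, c_n \text{ in } Z,$$

is in $A$. Then so also is $\alpha - c\alpha_n$ for every $c$ in $Z$. Since

    $$0 \le c_n - a_{nn}\left[\frac{c_n}{a_{nn}}\right] < a_{nn},$$

it follows from the minimality of $a_{nn}$ that in the representation of the
number $\alpha - [c_n/a_{nn}]\alpha_n$, the coefficient of $\rho_n$ is 0, so that

    $$\alpha - \left[\frac{c_n}{a_{nn}}\right]\alpha_n = d_1\rho_1 + \cdots + d_{n-1}\rho_{n-1}, \qquad d_1, \ldots, d_{n-1} \text{ in } Z.$$

Repeating the argument, we find that

    $$\alpha - \left[\frac{c_n}{a_{nn}}\right]\alpha_n - \left[\frac{d_{n-1}}{a_{n-1,n-1}}\right]\alpha_{n-1} = e_1\rho_1 + \cdots + e_{n-2}\rho_{n-2}.$$

After $n$ steps, we have

    $$\alpha - \left[\frac{c_n}{a_{nn}}\right]\alpha_n - \left[\frac{d_{n-1}}{a_{n-1,n-1}}\right]\alpha_{n-1} - \cdots = 0,$$

the desired representation.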

If there were two representations of the same number, their
difference would be a nontrivial representation of 0:

    $$k_1\alpha_1 + \cdots + k_n\alpha_n = 0, \qquad |k_1| + \cdots + |k_n| > 0.$$

But then also

    $$k_1\alpha_1^{(m)} + \cdots + k_n\alpha_n^{(m)} = 0, \qquad m = 1, 2, \ldots, n,$$

which implies that $\Delta(\alpha_1, \ldots, \alpha_n) = 0$, contrary to the equation

    $$\Delta(\alpha_1, \ldots, \alpha_n) = a_{11}^2a_{22}^2 \cdots a_{nn}^2\,\Delta(\rho_1, \ldots, \rho_n) \neq 0.$$

The proof is complete.

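From their definitions, it is clear that each coefficient $a_{ii}$ is positive
and not larger than $a$, the smallest positive integer in $A$. We would
like to show that bounds can also be put on the other coefficients $a_{ij}$,
$1 \le j < i \le n$. We have

    $$\begin{aligned}
    \alpha_1 &= a_{11}\rho_1, \\
    \alpha_2 &= a_{21}\rho_1 + a_{22}\rho_2, \\
    \alpha_3 &= a_{31}\rho_1 + a_{32}\rho_2 + a_{33}\rho_3, \\
    &\;\;\vdots \\
    \alpha_n &= a_{n1}\rho_1 + a_{n2}\rho_2 + a_{n3}\rho_3 + \cdots + a_{nn}\rho_n.
    \end{aligned} \qquad (13)$$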

Theorem 2-19. Every ideal in $R[\vartheta]$ has a basis $\alpha_1, \ldots, \alpha_n$, given
by (13), in which the numbers $a_{ij}$ are rational integers with

    $$0 \le a_{ij} < a_{jj}, \qquad 1 \le j < i \le n.$$


Proof: It is clear that any system of numbers $\alpha_1, \ldots, \alpha_{i-1}$,
$\alpha_i - k\alpha_j$, $\alpha_{i+1}, \ldots, \alpha_n$, in which $k$ is a rational integer and $j \neq i$, is
also a basis of $A$. For if $\alpha$ is in $A$ and

    $$\alpha = k_1\alpha_1 + k_2\alpha_2 + \cdots + k_n\alpha_n, \qquad k_1, \ldots, k_n \text{ in } Z,$$

then

    $$\alpha = k_1\alpha_1 + \cdots + (k_j + kk_i)\alpha_j + \cdots + k_i(\alpha_i - k\alpha_j) + \cdots + k_n\alpha_n.$$

In the set of equations (13), subtract a suitable multiple of $\alpha_{n-1}$ from
$\alpha_n$, so that the new coefficient of $\rho_{n-1}$ is non-negative but smaller than
$a_{n-1,n-1}$. Then subtract a suitable multiple of $\alpha_{n-2}$, so that the new
coefficient of $\rho_{n-2}$ is non-negative but smaller than $a_{n-2,n-2}$; this does
not disturb the coefficient of $\rho_{n-1}$. Continuing the process, we come
eventually to a basis element $\alpha_n$ such that $0 \le a_{ni} < a_{ii}$ for
$i = 1, \ldots, n - 1$. Then we change $\alpha_{n-1}$ by subtracting off suitable
multiples of $\alpha_{n-2}, \alpha_{n-3}, \ldots, \alpha_1$, etc. The result is a basis as described
in the theorem.
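
For a quadratic domain the construction of Theorems 2-18 and 2-19 can
be carried out mechanically. The sketch below is restricted, as an
assumption, to $R[\sqrt{d}]$ with $d \equiv 2$ or $3 \pmod 4$, so that $\rho_1 = 1$,
$\rho_2 = \sqrt{d}$ is an integral basis; the function name is hypothetical. It
uses Euclid's algorithm on the coefficients of $\sqrt{d}$ in place of the
minimality argument of the proof.

```python
from math import gcd


def ideal_basis(generators, d):
    """Triangular Z-basis of the ideal of Z[sqrt(d)] with the given generators.

    Each generator (a, b) stands for a + b*sqrt(d), written in terms of
    rho_1 = 1, rho_2 = sqrt(d).  Returns (a11, a21, a22), meaning the basis
    alpha_1 = a11,  alpha_2 = a21 + a22*sqrt(d),  with 0 <= a21 < a11.
    Assumes d = 2 or 3 (mod 4) and at least one nonzero generator.
    """
    # As an additive group the ideal is spanned by g and g*sqrt(d).
    vectors = []
    for a, b in generators:
        vectors.append((a, b))
        vectors.append((d * b, a))      # (a + b*sqrt(d)) * sqrt(d)
    pivot = (0, 0)   # current candidate for alpha_2
    a11 = 0          # gcd of the rational integers found so far
    for vec in vectors:
        u, v = pivot, vec
        # Euclid's algorithm on the coefficients of sqrt(d).
        while v[1] != 0:
            q = u[1] // v[1]
            u, v = v, (u[0] - q * v[0], u[1] - q * v[1])
        pivot, a11 = u, gcd(a11, v[0])
    if pivot[1] < 0:
        pivot = (-pivot[0], -pivot[1])
    # Reduce a21 modulo a11, as in the proof of Theorem 2-19.
    return (a11, pivot[0] % a11, pivot[1])


print(ideal_basis([(3, 0), (4, 1)], -5))   # the ideal [3, 4 + sqrt(-5)]
```

For the ideal $[3, 4 + \sqrt{-5}]$ this yields the basis $3$, $1 + \sqrt{-5}$. Because
the reduced triangular form is unique, two lists of generators define
the same ideal exactly when the function returns the same triple.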

Corollary. A positive rational integer occurs in only finitely many
ideals of $R[\vartheta]$.

This follows immediately from the theorem, for if $a$ is in $A$, then
$a_{ii} \le a$, and there are only finitely many sets of coefficients $a_{ij}$
satisfying the conditions of the theorem.


The discriminant of the elements of a basis of an ideal is called the
discriminant of the ideal; its value is independent of the choice of
basis. For if $\alpha_1, \ldots, \alpha_n$ and $\alpha_1', \ldots, \alpha_n'$ are bases of $A$, then there
are $h_{ki}$ in $Z$ such that

    $$\alpha_k' = \sum_{i=1}^{n} h_{ki}\alpha_i, \qquad k = 1, \ldots, n,$$

and

    $$\det |h_{ki}| \neq 0.$$

Hence

    $$\Delta(\alpha_1', \ldots, \alpha_n') = (\det |h_{ki}|)^2\,\Delta(\alpha_1, \ldots, \alpha_n),$$

so that the discriminants have the same sign and

    $$\Delta(\alpha_1, \ldots, \alpha_n) \mid \Delta(\alpha_1', \ldots, \alpha_n').$$

By symmetry,

    $$\Delta(\alpha_1', \ldots, \alpha_n') \mid \Delta(\alpha_1, \ldots, \alpha_n),$$

and the discriminants are equal.





PROBLEMS 

1. Show that every ideal in Z is principal. 

2. If $A = [p, a + b\sqrt{d}]$ is an ideal of $R[\sqrt{d}]$, where $p$ is a rational prime
and $d$ is a square-free integer not of the form $4k + 1$, show that $p$ and
$a + (b - p[b/p])\sqrt{d}$ form a basis for $A$.


2-6 The arithmetic of ideals. Ideals are special kinds of sets of 
elements. The emphasis so far has been on the elements comprising 
the sets. The whole power of the theory of ideals, however, lies in 
considering them not as collections of elements, but as entities in their 
own right, which can be combined according to certain operations. 

The first of these operations is multiplication. If $A = [\alpha_1, \ldots, \alpha_r]$
and $B = [\beta_1, \ldots, \beta_s]$, then the product $AB$ is the ideal

    $$[\alpha_1\beta_1, \ldots, \alpha_1\beta_s, \alpha_2\beta_1, \ldots, \alpha_r\beta_s].$$

The product ideal does not depend on the representation chosen for
$A$ and $B$. To show this, let $AB = C$, and suppose that also

    $$A = [\alpha_1', \ldots, \alpha_{r'}'], \qquad B = [\beta_1', \ldots, \beta_{s'}'].$$

To keep matters straight, designate these last ideals by $A'$ and $B'$,
even though they are equal to $A$ and $B$. We must show that every
element of $C$ is also an element of $A'B' = C'$, and conversely.

First of all, $\alpha_i'$ is in $A$ and $\beta_j'$ is in $B$, so that we can write

    $$\alpha_i' = \lambda_1\alpha_1 + \cdots + \lambda_r\alpha_r, \qquad \beta_j' = \mu_1\beta_1 + \cdots + \mu_s\beta_s.$$

Hence the number

    $$\alpha_i'\beta_j' = \sum_{k,l} \lambda_k\mu_l\,\alpha_k\beta_l$$

is in $C$ for $1 \le i \le r'$, $1 \le j \le s'$. Since $C$ is an ideal, every linear
combination of the numbers $\alpha_i'\beta_j'$ is in $C$; thus $C'$ is a subset of $C$.
Hence $C = C'$, by symmetry.
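
The independence of the product from the chosen generators can be
observed in a small case. The sketch below works in $R[\sqrt{-5}]$ with the
ideal $[2, 1 + \sqrt{-5}]$, an example chosen here for illustration and not
taken from the text. Equality of ideals is tested by comparing reduced
triangular bases in the sense of Theorem 2-19; all function names are
assumptions, and $d \equiv 2$ or $3 \pmod 4$ is assumed throughout.

```python
from math import gcd


def ideal_product(gens_a, gens_b, d):
    """Generators alpha_i * beta_j of the product ideal in Z[sqrt(d)]."""
    return [(a * c + d * b * e, a * e + b * c)
            for a, b in gens_a for c, e in gens_b]


def ideal_basis(generators, d):
    """Reduced basis (a11, a21, a22): a11 and a21 + a22*sqrt(d) span the ideal."""
    vectors = []
    for a, b in generators:
        vectors += [(a, b), (d * b, a)]   # g and g*sqrt(d)
    pivot, a11 = (0, 0), 0
    for vec in vectors:
        u, v = pivot, vec
        while v[1] != 0:                  # Euclid on the sqrt(d) coefficient
            q = u[1] // v[1]
            u, v = v, (u[0] - q * v[0], u[1] - q * v[1])
        pivot, a11 = u, gcd(a11, v[0])
    if pivot[1] < 0:
        pivot = (-pivot[0], -pivot[1])
    return (a11, pivot[0] % a11, pivot[1])


d = -5
A1 = [(2, 0), (1, 1)]     # [2, 1 + sqrt(-5)]
A2 = [(2, 0), (1, -1)]    # the same ideal, written as [2, 1 - sqrt(-5)]
print(ideal_basis(A1, d) == ideal_basis(A2, d))
print(ideal_basis(ideal_product(A1, A1, d), d),
      ideal_basis(ideal_product(A1, A2, d), d),
      ideal_basis([(2, 0)], d))
```

Both products reduce to the same basis as the principal ideal $[2]$, so in
this instance the square of a nonprincipal ideal is principal.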

Theorem 2-20. If $A$ is an ideal of $R[\vartheta]$, there exists an ideal $B$ of
$R[\vartheta]$ such that $AB$ is a principal ideal $[a]$, where $a$ is in $Z$.

Remark: It is this theorem which is the crux of the whole matter.
As indicated in the discussion at the beginning of Section 2-5, we are
trying to enlarge the set of possible divisors of an integer by introducing
ideal elements. Given any such divisor, there should certainly be
a second divisor whose product with the first is the original integer.
Since we have taken divisors as sets, we must identify the original
integer with the set of all its multiples. It should be noted that all
the associates of a given integer generate the same principal ideal.

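Proof: Suppose $A = [\alpha_1, \ldots, \alpha_r]$, and put

    $$f(x) = \alpha_1 + \alpha_2x + \cdots + \alpha_rx^{r-1}.$$

By representing $\alpha_1, \ldots, \alpha_r$ as polynomials in $\vartheta$, and replacing $\vartheta$ in all
the polynomials by $\vartheta_2, \vartheta_3, \ldots, \vartheta_n$ in turn, we get sets $\alpha_1^{(l)}, \ldots, \alpha_r^{(l)}$,
where $l = 2, 3, \ldots, n$. We define

    $$g(x) = \prod_{l=2}^{n} \left(\alpha_1^{(l)} + \alpha_2^{(l)}x + \cdots + \alpha_r^{(l)}x^{r-1}\right)
         = \beta_1 + \beta_2x + \cdots + \beta_sx^{s-1}.$$

The $\beta$'s are symmetric polynomials, with rational integral coefficients,
in all the conjugates of $\alpha_1, \ldots, \alpha_r$ except $\alpha_1, \ldots, \alpha_r$ themselves.
Hence they are polynomials in $\alpha_1, \ldots, \alpha_r$, with coefficients in $Z$, and
therefore are in $R[\vartheta]$. It is asserted that the ideal $B = [\beta_1, \ldots, \beta_s]$
satisfies the conditions of the theorem.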

Put

    $$f(x)g(x) = \gamma_1 + \gamma_2x + \cdots + \gamma_{r+s-1}x^{r+s-2}.$$

Since each $\gamma$ is a symmetric polynomial, with rational integral
coefficients, in each $\alpha_i$ and its conjugates, the $\gamma$'s are themselves
rational integers. Let their GCD be $a$. Then $a$ can be represented as a
linear combination of $\gamma_1, \ldots, \gamma_{r+s-1}$, with coefficients in $Z$; since
$\gamma_1, \ldots, \gamma_{r+s-1}$ are obviously in $AB$, $a$ is in $AB$, and so $[a]$ is a subset
of $AB$.

If we knew that $a$ divides every product $\alpha_i\beta_j$, then we would know
that every element of $AB$ is contained in $[a]$. The proof will therefore
be complete when we prove Theorem 2-21, which is A. Hurwitz'
extension of a theorem due to Gauss.

Theorem 2-21. Let

    $$A(x) = \alpha_0x^r + \cdots + \alpha_r, \qquad B(x) = \beta_0x^s + \cdots + \beta_s,$$

where $\alpha_0\beta_0 \neq 0$, be polynomials with integral algebraic coefficients.
If an algebraic integer $\delta$ divides every coefficient of

    $$C(x) = A(x)B(x) = \gamma_0x^t + \cdots + \gamma_t,$$

in the sense that each quotient $\gamma_i/\delta$ is an algebraic integer, then $\delta$ also
divides every product $\alpha_k\beta_l$.

Proof: First we prove a lemma: if

    $$f(x) = \delta_0x^u + \cdots + \delta_u, \qquad \delta_0 \neq 0,$$

is any polynomial with integral algebraic coefficients and a zero $\rho$, then
$f(x)/(x - \rho)$ has integral coefficients. The proof is by induction on $u$.
If $u = 1$, then $f(x) = \delta_0x + \delta_1$, and

    $$\frac{f(x)}{x - \rho} = \frac{\delta_0x + \delta_1}{x + \delta_1/\delta_0} = \delta_0$$

is an integer. Suppose the lemma true for all polynomials of degree
less than $u$. Then the polynomial

    $$Q(x) = f(x) - \delta_0x^{u-1}(x - \rho)$$

has integral algebraic coefficients (by the second part of Theorem 2-10),
and has degree less than $u$ and vanishes for $x = \rho$. By
the induction hypothesis,

    $$\frac{Q(x)}{x - \rho}$$

has integral algebraic coefficients, and the same is therefore true of
$f(x)/(x - \rho)$. The lemma follows by the induction principle.

By repeated application of the lemma, we deduce that if $f(x) =
\delta_0(x - \rho_1) \cdots (x - \rho_u)$, then any product $\delta_0\rho_{n_1} \cdots \rho_{n_k}$, with distinct
indices $n_1, \ldots, n_k$, is an integer.

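Returning to Theorem 2-21, suppose that

    $$A(x) = \alpha_0(x - \rho_1) \cdots (x - \rho_r),$$
    $$B(x) = \beta_0(x - \sigma_1) \cdots (x - \sigma_s).$$

By assumption, the polynomial

    $$\frac{C(x)}{\delta} = \frac{\alpha_0\beta_0}{\delta}(x - \rho_1) \cdots (x - \rho_r)(x - \sigma_1) \cdots (x - \sigma_s)$$

has integral coefficients, and it follows that any product

    $$\frac{\alpha_0\beta_0}{\delta}\,\rho_{n_1} \cdots \rho_{n_k}\,\sigma_{m_1} \cdots \sigma_{m_l},
      \qquad 1 \le n_1 < \cdots < n_k \le r, \quad 1 \le m_1 < \cdots < m_l \le s, \qquad (14)$$

is an integer. Since $\alpha_k/\alpha_0$ and $\beta_l/\beta_0$ are elementary symmetric
functions in the $\rho$'s and $\sigma$'s, respectively, the number

    $$\frac{\alpha_k\beta_l}{\delta} = \frac{\alpha_0\beta_0}{\delta} \cdot \frac{\alpha_k}{\alpha_0} \cdot \frac{\beta_l}{\beta_0}$$

is a sum of terms of the form (14), and is therefore an integer. The
proof is complete.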

Theorem 2-22. If $AC = BC$, then $A = B$.

Remark: Note that there is no zero ideal.

Proof: Let $D$ be an ideal such that $CD = [c]$, a principal ideal.
Then $ACD = BCD$, so $A[c] = B[c]$. Thus $c$ times any element of $A$
is equal to $c$ times some element of $B$, and so $A = B$.

If A = BC, then we say that C divides A, and write C|A. 

Theorem 2-23. $A \mid C$ if and only if every element of $C$ is in $A$.

Proof: If $A = [\alpha_1, \ldots, \alpha_r]$ and $B = [\beta_1, \ldots, \beta_s]$, then $AB = C =
[\alpha_1\beta_1, \ldots, \alpha_i\beta_j, \ldots]$, so every element of $C$ is in $A$, and also in $B$.
Conversely, suppose that every element of $C$ is in $A$. Then every
element of $CD$ is in $AD$, for every $D$. Choose $D$ so that $AD = [c]$ is
principal, and let $CD = [\sigma_1, \ldots, \sigma_t]$. Then for each $i$ with $1 \le i \le t$,
$\sigma_i = c\lambda_i$ for a suitable integer $\lambda_i$. Hence $CD = [c][\lambda_1, \ldots, \lambda_t] =
AD[\lambda_1, \ldots, \lambda_t]$, and by Theorem 2-22, $C = A[\lambda_1, \ldots, \lambda_t]$, so that
$A \mid C$.

Theorem 2-24. An ideal is divisible by only a finite number of 
ideals. 

Proof: If the ideal is A, choose B so that AB = [c], where c is a 
positive integer. Then c is in A and in every divisor of A, and by the 
corollary to Theorem 2-19, there are only finitely many such ideals. 

A common divisor of A and B which is divisible by every common 
divisor is called a greatest common divisor (gcd) of A and B. 

Theorem 2-25. Every pair of ideals $A$ and $B$ has a unique GCD,
$(A, B)$. It is composed of the numbers $\alpha + \beta$, where $\alpha$ runs over $A$
and $\beta$ over $B$.

Proof: Let $D$ be the set described in the theorem; it is clearly an
ideal. Since 0 is in $A$ and $B$, $D$ contains every element of $A$ and of $B$,
and so is a divisor of $A$ and of $B$. Any common divisor of $A$ and $B$
contains all the elements of $A$ and of $B$, and since it is closed under
addition, it contains all numbers $\alpha + \beta$, and so divides $D$.

If $D'$ is also a GCD of $A$ and $B$, then $D$ and $D'$ are divisors of each
other, and so each contains the other. Thus $D = D'$.


If the GCD of $A$ and $B$ is $[1]$, we say that $A$ and $B$ are relatively prime.
As an immediate consequence of this definition and Theorem 2-25,
we have

Theorem 2-26. If $A$ and $B$ are relatively prime, there exist $\alpha$ in $A$
and $\beta$ in $B$ such that $\alpha + \beta = 1$.

Theorem 2-27. If $A \mid BC$ and $A$ is prime to $B$, it divides $C$.

Proof: Choose $\alpha$ in $A$ and $\beta$ in $B$ so that $\alpha + \beta = 1$. Then if
$\gamma$ is in $C$, $\alpha\gamma + \beta\gamma = \gamma$, and $\beta\gamma$ and $\alpha\gamma$ are in $A$, so that $\gamma$ is in $A$.
Hence $A \mid C$.


If $A$ has no divisors except itself and $[1]$, then $A$ is said to be prime.

Theorem 2-28. Every ideal can be represented as a finite product of
prime ideals, and the representation is unique except for the order of
factors.

The finiteness of the representation follows from Theorem 2-24, and
the uniqueness from Theorem 2-27.

In particular, it follows that the principal ideal generated by any
element of $R[\vartheta]$ has a unique factorization into prime ideals of $R[\vartheta]$.
If these prime factors are themselves always principal ideals, we might
expect that ideals can be dispensed with entirely, and that there is
then unique factorization of the numbers themselves.

Theorem 2-29. A necessary and sufficient condition that $R[\vartheta]$ be a
unique factorization domain is that every ideal of $R[\vartheta]$ be a principal
ideal.

Proof: Uniqueness of factorization in $R[\vartheta]$ is equivalent to the
property:

    if $\alpha \mid \beta\gamma$ and $\alpha$ and $\beta$ are relatively prime, then $\alpha \mid \gamma$. (15)

For if the domain has this property, unique factorization can be proved
in the usual way, while if factorization is unique and $\alpha \mid \beta\gamma$, then every
prime $\pi$ dividing $\alpha$ must occur in the factorization of $\beta\gamma$, since this
factorization is the product of the factorizations of $\beta$ and $\gamma$; if $\pi$ does
not occur in $\beta$ it must occur in $\gamma$.

Suppose that factorization is unique in $R[\vartheta]$, so that (15) holds.
Then if $\pi$ is a prime number, $[\pi]$ is a prime ideal. For if $[\pi] = AB$,
where neither $A$ nor $B$ is $[1]$, there would exist an $\alpha$ in $A$ and a $\beta$ in $B$,
neither of which is divisible by $\pi$, while their product is.

Let $P$ be any prime ideal, and $\alpha = \pi_1^{e_1} \cdots \pi_r^{e_r}$ any element of $P$.
Then

    $$[\alpha] = [\pi_1]^{e_1} \cdots [\pi_r]^{e_r},$$

and since $\alpha$ is in $P$, so is every element of $[\alpha]$, whence $P \mid [\alpha]$ and $P$ is
one of the principal ideals $[\pi_i]$. Since every prime ideal is principal,
every ideal is principal.

Now suppose that every ideal in $R[\vartheta]$ is principal, and that $\alpha$ and $\beta$
are relatively prime. Then $[\alpha, \beta] = [\gamma]$, for some $\gamma$, and every linear
combination $\lambda\alpha + \mu\beta$ is a multiple of $\gamma$. Taking $\lambda = 1$ and $\mu = 0$,
we have $\gamma \mid \alpha$; for $\lambda = 0$ and $\mu = 1$, we obtain $\gamma \mid \beta$. Hence $\gamma$ is a unit,
$[\alpha, \beta] = [1]$, and we can take $\gamma = 1$. Thus there are $\lambda$ and $\mu$ such
that $\lambda\alpha + \mu\beta = 1$, so that if $\alpha \mid \beta\gamma$, then $\alpha$ divides $\lambda\alpha\gamma + \mu\beta\gamma = \gamma$,
and (15) holds. Hence factorization is unique.

PROBLEMS 

1. Using Theorem 2-21, reformulate and prove the new version of
Eisenstein's irreducibility criterion, as given in Problem 1, Section 2-2.

2. Show that if $A = [a_1 + b_1\sqrt{d}, a_2 + b_2\sqrt{d}]$ is an ideal of $R[\sqrt{d}]$, then
the product of $A$ with its conjugate ideal $A' = [a_1 - b_1\sqrt{d}, a_2 - b_2\sqrt{d}]$ is
principal.

2-7 Congruences. The norm of an ideal. Two elements $\alpha$ and $\beta$
of $R[\vartheta]$ will be said to be congruent modulo an ideal $A$ if their difference
lies in $A$, that is, if $A$ divides the ideal $[\alpha - \beta]$. This is a natural
extension of the earlier notion of congruence of rational integers, if
the modulus $m$ is identified with the principal ideal $[m]$. The familiar
properties of congruences are easily seen to hold.

For fixed $\alpha$, the set of all elements of $R[\vartheta]$ which are congruent to $\alpha$
modulo $A$ is called a residue class modulo $A$.

Theorem 2-30. There are only finitely many residue classes
modulo $A$.





Proof: Choose $B$ so that $AB = [c]$, where $c$ is a rational integer.
Then $\alpha_1 \not\equiv \alpha_2 \pmod A$ implies that $\alpha_1 \not\equiv \alpha_2 \pmod{[c]}$, since $A \mid [c]$
and therefore $A$ contains $[c]$. So if we can show that there are only
finitely many elements, no two of which are congruent modulo $[c]$,
the theorem follows. But this is an immediate consequence of the
fact that in the basis representation

    $$\alpha = r_1\omega_1 + \cdots + r_n\omega_n,$$

where $\omega_1, \ldots, \omega_n$ form an integral basis of $R[\vartheta]$, each of the rational
integral coefficients $r_1, \ldots, r_n$ has only $c$ possible values modulo $c$,
and that if

    $$r_i \equiv r_i' \pmod c, \qquad i = 1, \ldots, n,$$

then

    $$r_1\omega_1 + \cdots + r_n\omega_n \equiv r_1'\omega_1 + \cdots + r_n'\omega_n \pmod{[c]}.$$

The number of residue classes modulo $A$ is called the norm of $A$,
written $NA$. For the time being, it is necessary to distinguish between
$N\alpha$ and $N[\alpha]$, the norms of the number $\alpha$ and the ideal $[\alpha]$,
respectively. However, we shall soon see that the two quantities are
essentially the same.

Theorem 2-31. If $R(\vartheta)$ has discriminant $\Delta$, and $A$ is an ideal of
$R[\vartheta]$ having discriminant $\Delta(A)$, then

    $$\Delta(A) = (NA)^2\Delta.$$

Proof: Let $\alpha_1, \ldots, \alpha_n$ be the basis of $A$ described in Theorem 2-19,
and let $\rho_1, \ldots, \rho_n$ be a basis of $R[\vartheta]$. Then

    $$\Delta(A) = (a_{11} \cdots a_{nn})^2\Delta,$$

and we must show that $NA = a_{11} \cdots a_{nn}$; that is, that there are
$a_{11} \cdots a_{nn}$ numbers of $R[\vartheta]$, no two of which are congruent modulo $A$,
and such that every element of $R[\vartheta]$ is congruent to one of them. We
show that this is true of the numbers

    $$r_1\rho_1 + \cdots + r_n\rho_n,$$

where $0 \le r_i < a_{ii}$ for $i = 1, \ldots, n$. If two of these numbers are
congruent, say

    $$r_1\rho_1 + \cdots + r_n\rho_n \equiv r_1'\rho_1 + \cdots + r_n'\rho_n \pmod A,$$

and $r_n \ge r_n'$, then

    $$(r_1 - r_1')\rho_1 + \cdots + (r_n - r_n')\rho_n \equiv 0 \pmod A.$$

But $a_{nn}$ is the smallest positive rational integer for which any number
of the form

    $$s_1\rho_1 + \cdots + s_{n-1}\rho_{n-1} + a_{nn}\rho_n$$

is in $A$; since $0 \le r_n - r_n' < a_{nn}$, it follows that $r_n - r_n' = 0$.
Similarly, $r_{n-1} = r_{n-1}', \ldots, r_1 = r_1'$.

If

    $$\beta = s_1\rho_1 + \cdots + s_n\rho_n,$$

then

    $$\beta - \left[\frac{s_n}{a_{nn}}\right]\alpha_n = s_1'\rho_1 + \cdots + s_{n-1}'\rho_{n-1} + b_n\rho_n,$$

where $0 \le b_n < a_{nn}$. By iteration,

    $$\beta - \left[\frac{s_n}{a_{nn}}\right]\alpha_n - \left[\frac{s_{n-1}'}{a_{n-1,n-1}}\right]\alpha_{n-1} - \cdots = b_1\rho_1 + \cdots + b_n\rho_n,$$

where $0 \le b_k < a_{kk}$ for $k = 1, \ldots, n$, and

    $$\beta \equiv b_1\rho_1 + \cdots + b_n\rho_n \pmod A.$$




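Corollary. $N[\alpha] = |N\alpha|$.

For $\alpha\rho_1, \ldots, \alpha\rho_n$ is clearly a basis for $[\alpha]$, and

    $$\Delta(\alpha\rho_1, \ldots, \alpha\rho_n) = (N\alpha)^2\Delta,$$

so that $(N\alpha)^2 = (N[\alpha])^2$. But $N[\alpha]$, being the number of residue
classes, is positive.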
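
The corollary can be checked by direct counting in the Gaussian domain.
The sketch below, with hypothetical function names, collects mutually
incongruent representatives modulo $[\alpha]$ from the box $0 \le x, y < N\alpha$.
Every residue class meets this box, because the rational integer $N\alpha$
lies in $[\alpha]$. Pairs `(a, b)` stand for $a + bi$.

```python
def divides(alpha, beta):
    """True if alpha divides beta in the Gaussian domain (alpha nonzero)."""
    (a, b), (c, d) = alpha, beta
    n = a * a + b * b
    # beta / alpha = beta * conj(alpha) / N(alpha)
    return (c * a + d * b) % n == 0 and (d * a - c * b) % n == 0


def count_residue_classes(alpha):
    """Number of residue classes modulo the principal ideal [alpha]."""
    n = alpha[0] * alpha[0] + alpha[1] * alpha[1]
    representatives = []
    for x in range(n):
        for y in range(n):
            # Keep (x, y) only if it is incongruent to every earlier choice.
            if not any(divides(alpha, (x - u, y - v))
                       for u, v in representatives):
                representatives.append((x, y))
    return len(representatives)


for alpha in [(1, 2), (3, 0), (2, 2)]:
    print(alpha, count_residue_classes(alpha))
```

The running time grows quickly with $N\alpha$, so this is only meant for very
small examples.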


Theorem 2-32. If $A$ and $B$ are ideals, then there is an $\alpha$ in $A$
such that $([\alpha], AB) = A$.

If such an $\alpha$ exists, then clearly $[\alpha] = AC$, where $(B, C) = [1]$.
If we rephrase the theorem, its close relation to Theorem 2-20
becomes clear: given two ideals $A$ and $B$, there is a $C$ such that $AC$ is
principal and $(B, C) = [1]$.

Proof: Let $P_1, \ldots, P_r$ be the distinct primes dividing $AB$, and let

    $$A = P_1^{e_1} \cdots P_r^{e_r}, \qquad e_i \ge 0.$$

Put

    $$D_i = \prod_{k \neq i} P_k^{e_k+1}.$$

Since $(D_1, \ldots, D_r) = [1]$, there are numbers $\delta_i$ in $D_i$, for $i = 1, \ldots, r$,
such that

    $$\delta_1 + \cdots + \delta_r = 1.$$

Then $[\delta_i]$ is divisible by $D_i$, and therefore by $P_k$ for $k \neq i$, and
therefore not by $P_i$, since 1 is not. Now let $\alpha_i$ be an element of $P_i^{e_i}$ which
does not occur in $P_i^{e_i+1}$ for $i = 1, \ldots, r$, and put

    $$\alpha = \alpha_1\delta_1 + \cdots + \alpha_r\delta_r.$$

Then for each $i$, every term but one in this representation of $\alpha$ occurs
in $P_i^{e_i+1}$, while the remaining term occurs in $P_i^{e_i}$ but not in $P_i^{e_i+1}$.
Hence $A \mid [\alpha]$, but no $P_i^{e_i+1}$ divides $[\alpha]$, so that $([\alpha], AB) = A$.

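Theorem 2-33. The congruence

    $$\alpha\xi \equiv \beta \pmod A$$

is solvable if and only if $D \mid [\beta]$, where $D = ([\alpha], A)$. The solution, if
it exists, is unique modulo $A/D$.

Proof: If $\xi$ is a solution, then $\alpha\xi - \beta = \gamma$ is in $A$, and therefore in
$D$. Since also $\alpha\xi$ is in $D$, it follows that $\beta$ is in $D$, so $D \mid [\beta]$.

If $\beta$ is in $D$, then it is the sum of an element of $[\alpha]$ and an element of
$A$; that is, $\beta = \alpha\xi + \delta$. Since $\delta \equiv 0 \pmod A$, $\alpha\xi \equiv \beta \pmod A$.

If $\alpha\xi \equiv \alpha\xi' \equiv \beta \pmod A$, then $\alpha(\xi - \xi') \equiv 0 \pmod A$. Hence if
$[\alpha] = DA_1$ and $A = DA_2$, then $(A_1, A_2) = [1]$ and

    $$DA_2 \mid DA_1[\xi - \xi'],$$
    $$A_2 \mid A_1[\xi - \xi'],$$
    $$A_2 \mid [\xi - \xi'],$$
    $$\xi \equiv \xi' \pmod{A_2}.$$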
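
In the domain $Z$, where every ideal is principal, Theorem 2-33 reduces
to the familiar rule for the linear congruence $ax \equiv b \pmod m$. The
following sketch covers only this special case; the function name is an
assumption, and the modular inverse uses Python's three-argument `pow`
with exponent $-1$ (available from Python 3.8).

```python
from math import gcd


def solve_congruence(a, b, m):
    """Solve a*x = b (mod m) for rational integers, with m >= 1.

    Returns None if gcd(a, m) does not divide b; otherwise returns
    (x, m2), meaning that the solutions are exactly x modulo m2, where
    m2 = m / gcd(a, m), the analogue of uniqueness modulo A/D.
    """
    g = gcd(a, m)
    if b % g != 0:
        return None
    m2 = m // g
    if m2 == 1:
        return (0, 1)
    x = (b // g) * pow(a // g, -1, m2) % m2
    return (x, m2)


print(solve_congruence(6, 9, 15))   # solvable, since gcd(6, 15) = 3 divides 9
print(solve_congruence(6, 8, 15))   # not solvable, since 3 does not divide 8
```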

Theorem 2-34. $N(AB) = NA \cdot NB$.

Proof: By Theorem 2-32, there is a $\gamma$ such that

    $$([\gamma], AB) = A.$$

Let $NA = n_1$, $NB = n_2$, and let $\alpha_1, \ldots, \alpha_{n_1}$ and $\beta_1, \ldots, \beta_{n_2}$ be
complete residue systems modulo $A$ and $B$, respectively. We shall show
that the $n_1n_2$ numbers $\alpha_i + \gamma\beta_j$ form a complete residue system
modulo $AB$.

If

    $$\alpha_i + \gamma\beta_j \equiv \alpha_k + \gamma\beta_l \pmod{AB},$$

then

    $$\gamma(\beta_j - \beta_l) \equiv \alpha_k - \alpha_i \pmod{AB},$$

and by Theorem 2-33, $([\gamma], AB) \mid [\alpha_k - \alpha_i]$, so that $A \mid [\alpha_k - \alpha_i]$.
This gives $\alpha_k \equiv \alpha_i \pmod A$, so $k = i$. Moreover,

    $$\gamma(\beta_j - \beta_l) \equiv 0 \pmod{AB},$$
    $$\beta_j - \beta_l \equiv 0 \pmod B,$$
    $$j = l.$$

To show that every integer $\delta$ is congruent to one of the above
numbers, choose $\alpha_i$ so that $\delta \equiv \alpha_i \pmod A$. Then the congruence

    $$\gamma\xi \equiv \delta - \alpha_i \pmod{AB}$$

is solvable, since $([\gamma], AB) = A$ is a divisor of $[\delta - \alpha_i]$. Finally, $\xi$ is
unique modulo $B$, and can therefore be taken to be one of the numbers
$\beta_j$.

Theorem 2-35. $NA$ is an element of $A$.

Proof: If $\alpha_1, \ldots, \alpha_{NA}$ is a complete residue system modulo $A$, then
so is $\alpha_1 + 1, \ldots, \alpha_{NA} + 1$. Hence

    $$\alpha_1 + \cdots + \alpha_{NA} \equiv (\alpha_1 + 1) + \cdots + (\alpha_{NA} + 1) \pmod A,$$
    $$0 \equiv NA \pmod A.$$

Corollary. There are only finitely many ideals of given norm.

For by the corollary to Theorem 2-19, a positive rational integer
occurs in only finitely many ideals.

PROBLEMS 

1. Show that if $P$ is a prime ideal of $R[\vartheta]$, the congruence

    $$\xi^m + \alpha_1\xi^{m-1} + \cdots + \alpha_m \equiv 0 \pmod P$$

with coefficients in $R[\vartheta]$ has at most $m$ incongruent solutions modulo $P$.

2. Show that if F is a prime ideal of a is an element of R[i^], and 
F|[al, then 





2-8 Prime ideals 

Theorem 2-36. If $NA$ is prime, so is $A$.

This follows immediately from Theorem 2-34.

Theorem 2-37. There are infinitely many prime ideals $P$ in any
domain $R[\vartheta]$. Each such $P$ divides exactly one rational prime $p$, and
$NP = p^f$, where $f$, called the degree of $P$, is a positive integer not
exceeding the degree of $R(\vartheta)$.

Proof: Let $p$ be a rational prime, and let $P$ be one of the factors of
$[p]$ in $R[\vartheta]$. Then if $P$ also divided the ideal defined by another
rational prime $p'$, it would divide their GCD, which is $[1]$. Hence each
$P$ divides at most one $p$, and each of the infinitely many rational
primes $p$ is divisible by at least one $P$, so that there must be infinitely
many $P$'s.

Now let $a$ be a rational integer such that $P \mid [a]$; by Theorem 2-35,
we could take $a = NP$. If $a = p_1 \cdots p_r$, then

    $$P \mid [p_1] \cdots [p_r],$$

and so $P \mid [p_i]$ for some $i$.

Finally, if $P \mid [p]$ then $[p] = PA$ for some $A$. By the corollary to
Theorem 2-31,

    $$N[p] = |Np| = p^n,$$

and so $N(PA) = NP \cdot NA = p^n$. Hence $NP \mid p^n$, and the proof is
complete.

Theorem 2-37 shows that the primes of $R[\vartheta]$ are to be found among
the factors of the principal ideals $[p]$. Only partial information is
available about the way these ideals decompose, and the derivation of
most of what is known is too intricate for inclusion here, but we can
prove the simpler half of a famous theorem due to Dedekind, which
states that $[p]$ is divisible by the square of a prime ideal in $R[\vartheta]$ if and
only if $p$ divides $\Delta$, the discriminant of $R(\vartheta)$.

Theorem 2-38. If $p$ does not divide $\Delta$, then $[p]$ factors as a product
of (one or more) distinct prime ideals.

Proof: Suppose that $P^2 \mid [p]$, so that $[p] = P^2M$. Choose an element
$\alpha$ of $PM$ which does not belong to $P^2M$, so that $p \mid \alpha^2$ but $p \nmid \alpha$. Since
$p \ge 2$, $p \mid (\alpha\beta)^p$ for every $\beta$ in $R[\vartheta]$.





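For an arbitrary element $\gamma$ of $R(\vartheta)$, define $S\gamma$, the trace of $\gamma$, by
the equation

    $$S\gamma = \gamma' + \cdots + \gamma^{(n)},$$

where $\gamma', \ldots, \gamma^{(n)}$ are the field conjugates of $\gamma$. By the Symmetric
Function Theorem, $S\gamma$ is in $Z$ if $\gamma$ is an integer, and it is clear that
$S(r\gamma) = rS\gamma$ if $r$ is rational. In particular,

    $$S\left(\frac{(\alpha\beta)^p}{p}\right) = \frac{S((\alpha\beta)^p)}{p}$$

is in $Z$, so that $S((\alpha\beta)^p)$ is in $[p]$. By the multinomial theorem, if
$\beta', \ldots, \beta^{(n)}$ are the field conjugates of $\beta$, then

    $$(S(\alpha\beta))^p = (\alpha'\beta' + \cdots + \alpha^{(n)}\beta^{(n)})^p
       \equiv (\alpha'\beta')^p + \cdots + (\alpha^{(n)}\beta^{(n)})^p = S((\alpha\beta)^p)
       \equiv 0 \pmod p,$$

and since $S(\alpha\beta)$ is a rational integer, $p \mid S(\alpha\beta)$.

Now let $\rho_1, \ldots, \rho_n$ be an integral basis for $R[\vartheta]$. Then

    $$\alpha = h_1\rho_1 + \cdots + h_n\rho_n,$$

where the $h$'s are rational integers not all divisible by $p$, since $p \nmid \alpha$.
For $1 \le i \le n$ we have

    $$S(\alpha\rho_i) = S\left(\sum_{j=1}^{n} h_j\rho_j\rho_i\right) = \sum_{j=1}^{n} h_jS(\rho_i\rho_j).$$

Let

    $$d = \det |S(\rho_i\rho_j)|,$$

and let $A_{ij}$ be the cofactor of $S(\rho_i\rho_j)$ in $d$. Then for $k = 1, 2, \ldots, n$,

    $$\sum_{i=1}^{n} A_{ik}\sum_{j=1}^{n} h_jS(\rho_i\rho_j) = \sum_{j=1}^{n} h_j\sum_{i=1}^{n} A_{ik}S(\rho_i\rho_j) = dh_k.$$

Since

    $$p \,\Big|\, \sum_{j=1}^{n} h_jS(\rho_i\rho_j)$$

for each $i$, it is also true that $p \mid dh_k$ for each $k$; $p$ therefore divides $d$.
Finally,

    $$d = \det |S(\rho_i\rho_j)| = \det \Big|\sum_{k=1}^{n} \rho_i^{(k)}\rho_j^{(k)}\Big| = (\det |\rho_i^{(k)}|)^2 = \Delta;$$

hence $p \mid \Delta$.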




As an illustration of the present theorem, note that in the field
$R(i)$, of discriminant $-4$, we have

    $$[2] = [1 + i]^2,$$
    $$[p] = [a + bi][a - bi], \qquad a^2 + b^2 = p \equiv 1 \pmod 4,$$
    $$[q] = P, \text{ a prime ideal, if } q \equiv 3 \pmod 4.$$

Here $[1 + i]$, $[a + bi]$, and $[a - bi]$ are prime ideals of degree 1, while
each $P$ is of degree 2.*
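
The three cases of this illustration can be tabulated for small primes.
The sketch below assumes that its argument is a rational prime and finds
$a, b$ with $a^2 + b^2 = p$ by direct search; the function name and the
labels "ramified", "split", and "inert" are conveniences introduced here
and do not come from the text.

```python
from math import isqrt


def decompose_in_gaussian_domain(p):
    """Describe how [p] factors in the Gaussian domain, for a rational prime p.

    Returns ("ramified", (1, 1)) for p = 2, since [2] = [1 + i]^2;
    ("split", (a, b)) with a^2 + b^2 = p when p = 1 (mod 4);
    ("inert", None) when p = 3 (mod 4), where [p] is itself prime.
    """
    if p == 2:
        return ("ramified", (1, 1))
    if p % 4 == 1:
        for a in range(1, isqrt(p) + 1):
            b = isqrt(p - a * a)
            if a * a + b * b == p:
                return ("split", (a, b))
    return ("inert", None)


for p in (2, 3, 5, 7, 13):
    print(p, decompose_in_gaussian_domain(p))
```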

Theorem 2-39. Each ideal $[p]$, where $p$ is a rational prime, splits
into at most $n$ ideal factors in the integral domain of a field of degree $n$.

Proof: If $[p] = P_1 \cdots P_r$,

then $p^n = N[p] = NP_1 \cdots NP_r$,

and for each $i$, $NP_i > 1$. Hence $r \le n$.

PROBLEMS 

1. In the domain $R[\sqrt{-5}]$, put

$A = [3, 4 + \sqrt{-5}]$, $B = [3, 4 - \sqrt{-5}]$, $C = [7, 4 + \sqrt{-5}]$,

$D = [7, 4 - \sqrt{-5}]$.

Show that $AB = [3]$, $CD = [7]$, $AC = [4 + \sqrt{-5}]$, $BD = [4 - \sqrt{-5}]$,
and that A, B, C, and D are prime ideals. Factor [1 + S'v/—5]. 

2. Let $R(\sqrt{d})$, where $d$ is square-free, have discriminant $\Delta$. If $q$ is an
odd prime dividing $\Delta$, show that the ideals


$$\left[q, \frac{\Delta + \sqrt{\Delta}}{2}\right]$$

and

$$\left[q, \frac{\Delta - \sqrt{\Delta}}{2}\right]$$


are equal, and that their product is $[q]$. Show also that if $\Delta$ is even, then
$[2] = [2, \sqrt{d}]^2$ for $d \equiv 2 \pmod 4$ and that $[2] = [2, 1 + \sqrt{d}]^2$ if $d \equiv 3
\pmod 4$. This completes the proof of Dedekind's theorem, stated just
before Theorem 2-38, in the case of a quadratic field.

*Compare the remark following Theorem 7-7, Volume I.

2-9. Units of algebraic number fields. We saw in Section 2-4 that
the units of a quadratic field $R(\sqrt{d})$ are determined by the solutions
of the Pell equation $x^2 - dy^2 = N$ with $N = \pm 1$ or $\pm 4$, and it is an easy
consequence of this relationship and standard properties of Pell's equation*
that the group of units of $R(\sqrt{d})$ has a basis, consisting of $-1$ and
the fundamental solution $\varepsilon$ of the appropriate Pell equation. That is,
every unit of $R(\sqrt{d})$ can be written in the form $(-1)^{\alpha}\varepsilon^{\beta}$, where $\alpha$ is
0 or 1 and $\beta$ ranges over $Z$. We shall now show that this property is
not peculiar to quadratic fields, but that in fact the group of units in
each algebraic number field has a finite basis. (In general, if $G$ is a
commutative multiplicative group and $b_1, \ldots, b_m$ are elements of $G$,
they are said to form a basis for $G$ if every element of $G$ can be
represented in the form $b_1^{a_1} \cdots b_m^{a_m}$, and in every such representation of
the unit element $e$ of $G$, the factor $b_i^{a_i} = e$ for $1 \le i \le m$.) This
theorem, which is due to Dirichlet, can be sharpened by giving
the exact number of basis elements, but for many purposes, including
the application to be made in the next chapter, the finiteness of the
basis suffices. The upper bound which we shall obtain is actually the
correct number.
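
The quadratic case can be made concrete with a short computation. The sketch below is in Python and is not part of the original text; it assumes that $d > 1$ is square-free with $d \equiv 2$ or $3 \pmod 4$, so that only the equation $x^2 - dy^2 = \pm 1$ is involved, and it finds the fundamental solution by brute force. A unit $x + y\sqrt{d}$ is stored as the pair `(x, y)`; the names `fundamental_unit`, `mul`, `norm`, and `power` are illustrative.

```python
from math import isqrt

def fundamental_unit(d):
    """Smallest unit x + y*sqrt(d) > 1 with x^2 - d*y^2 = +1 or -1."""
    y = 1
    while True:
        for n in (-1, 1):
            x_squared = d * y * y + n
            x = isqrt(x_squared)
            if x > 0 and x * x == x_squared:
                return x, y
        y += 1

def mul(d, u, v):
    """Product of u = x1 + y1*sqrt(d) and v = x2 + y2*sqrt(d)."""
    return (u[0] * v[0] + d * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def norm(d, u):
    return u[0] * u[0] - d * u[1] * u[1]

def power(d, u, b):
    """u**b for any b in Z, where u is a unit (norm +1 or -1)."""
    if b < 0:
        n = norm(d, u)
        u, b = (n * u[0], -n * u[1]), -b   # inverse of a unit
    result = (1, 0)
    for _ in range(b):
        result = mul(d, result, u)
    return result

eps = fundamental_unit(2)                  # 1 + sqrt(2)
for b in range(-2, 4):
    print(b, power(2, eps, b))
```

Every pair printed has norm $\pm 1$; together with the sign $(-1)^{\alpha}$ these powers are the units described above.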

We introduce the symbol $\overline{|\alpha|}$ to designate the maximum of the
absolute values of the conjugates of the algebraic number $\alpha$, and
denote by $K$ a fixed algebraic number field.

Theorem 2-40. If $a$ is a fixed positive number, there are only
finitely many integers $\alpha$ of $K$ such that

$$\overline{|\alpha|} < a.$$

If all conjugates of $\alpha$ have absolute value 1, then $\alpha$ is a root of unity.

Proof: If $\overline{|\alpha|} < a$ and $\deg \alpha = n$, then each of the elementary
symmetric functions in $\alpha$ and its conjugates is numerically smaller
than some bound depending only on $a$ and $n$. If $\alpha$ is an integer in $K$,
then $n$ cannot exceed the degree of $K$, so that there are available only
finitely many coefficients for the defining polynomial of $\alpha$, and there
are, therefore, only finitely many such $\alpha$'s in $K$.

If $|\alpha^{(i)}| = 1$ for $i = 1, \ldots, n$, then $\overline{|\alpha^m|} = 1$ for all $m$ in $Z$, so that
by what we have just proved, $\alpha^{m_1} = \alpha^{m_2}$ for some distinct exponents
$m_1$ and $m_2$. Hence $\alpha^{m_1 - m_2} = 1$, so that $\alpha$ is a root of unity.

Theorem 2-41. The group $U$ of roots of unity in $K$ is a finite cyclic
group.

*See, for example, Theorems 8-5, 8-6, and 8-7, Volume I.

Proof: If $\zeta$ is a root of unity, then $\overline{|\zeta|} = 1$, and the finiteness of the
group follows from the preceding theorem. Let the various elements
$u_i$ of $U$ be primitive $w_i$-th roots of unity, for $i = 1, \ldots, t$, and put

$$w = \max(w_1, \ldots, w_t).$$

For fixed $i$, the numbers $e^{2\pi i h/w_i}$ and $e^{2\pi i h/w}$ are in $U$ for every
$h$ in $Z$. If $(w_i, w) = d$, choose $a$ and $b$ so that $aw + bw_i = d$; then
the product

$$e^{2\pi i(a/w_i + b/w)} = e^{2\pi i d/w_i w} = e^{2\pi i/[w_i, w]}$$

is in $U$, where $[w_i, w]$ denotes the lcm. It follows that the lcm of $w_i$ and $w$ does not exceed $w$, so that
$w_i \mid w$ for $i = 1, \ldots, t$. Since the powers of

$$\zeta_0 = e^{2\pi i/w}$$

include all $d$th roots of unity if $d \mid w$, it is clear that $\zeta_0$ generates $U$.
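
The key step of this proof, that roots of unity of orders $w_i$ and $w$ combine into one whose order is their lcm, can be checked with exact rational arithmetic. In the sketch below (Python, not part of the original text) the root $e^{2\pi i q}$ is represented by the rational number $q$ taken modulo 1, so that multiplying roots corresponds to adding fractions; the function names are illustrative.

```python
from fractions import Fraction
from math import gcd

def bezout(m, n):
    """Return (g, a, b) with a*m + b*n = g = gcd(m, n) (extended Euclid)."""
    old_r, r = m, n
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    return old_r, old_a, old_b

def order(q):
    """Order of exp(2*pi*i*q): the denominator of q in lowest terms."""
    return Fraction(q).denominator

def combined_root(w_i, w):
    """Exponent q of exp(2*pi*i*(a/w_i + b/w)), where a*w + b*w_i = (w_i, w)."""
    d, a, b = bezout(w, w_i)
    return (Fraction(a, w_i) + Fraction(b, w)) % 1

q = combined_root(4, 6)
print(q, order(q))
```

For $w_i = 4$ and $w = 6$ the combined root has order 12, the lcm of 4 and 6.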

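Now let $\vartheta$, of degree $n$, be a primitive element of $K$, so that
$K = R(\vartheta)$, and arrange the conjugates of $\vartheta$ in such an order that
$\vartheta^{(1)}, \ldots, \vartheta^{(r_1)}$ are real, while $\vartheta^{(r_1+1)}, \ldots, \vartheta^{(n)}$ are not real. (Note
that it is not necessarily true that $\vartheta^{(1)} = \vartheta$.) Then $n - r_1$ is an even
number, say $2r_2$, and we can further order the nonreal conjugates so
that $\vartheta^{(r_1+j)}$ and $\vartheta^{(r_1+r_2+j)}$ are complex-conjugate, for $j = 1, \ldots, r_2$.

If $\alpha$ is any number of $K$, the field conjugates of $\alpha$ are such that
$\alpha^{(1)}, \ldots, \alpha^{(r_1)}$ are real, while $\alpha^{(r_1+j)}$ and $\alpha^{(r_1+r_2+j)}$ are
complex-conjugate for $j = 1, \ldots, r_2$. Of course some of these latter numbers
may also be real, but in any case

$$|\alpha^{(r_1+j)}| = |\alpha^{(r_1+r_2+j)}| \quad \text{for } j = 1, \ldots, r_2. \qquad (16)$$

If $\varepsilon_1, \ldots, \varepsilon_k$ are units of $K$, they are said to be independent if the
relation

$$\varepsilon_1^{a_1} \cdots \varepsilon_k^{a_k} = 1, \quad a_1, \ldots, a_k \text{ in } Z, \qquad (17)$$

holds only for $a_1 = \cdots = a_k = 0$.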

Theorem 2-42. Units $\varepsilon_1, \ldots, \varepsilon_k$ are independent if and only
if the sole solution of the system

$$\sum_{m=1}^{k} x_m \log |\varepsilon_m^{(i)}| = 0, \quad i = 1, 2, \ldots, r, \qquad (18)$$

in rational integers is $x_1 = \cdots = x_k = 0$. Here $r = r_1 + r_2 - 1$.

Proof: Suppose that (17) has a solution in which not all the $a$'s
are zero. Then the analogous equation with each $\varepsilon_m$ replaced by $\varepsilon_m^{(i)}$
also holds, so that

$$|\varepsilon_1^{(i)}|^{a_1} \cdots |\varepsilon_k^{(i)}|^{a_k} = 1, \quad i = 1, \ldots, n,$$

and

$$\sum_{m=1}^{k} a_m \log |\varepsilon_m^{(i)}| = 0, \quad i = 1, \ldots, n. \qquad (19)$$

Conversely, if (19) holds with not all the rational integers $a_1, \ldots, a_k$
equal to zero, then $\varepsilon_1^{a_1} \cdots \varepsilon_k^{a_k}$ is an integer of $K$ all of whose
conjugates have absolute value 1; it is therefore a root of unity whose $w$th
power is 1, and (17) holds with $a_1, \ldots, a_k$ replaced by $wa_1, \ldots, wa_k$.
Hence the nontrivial solvability in $Z$ of (19) is equivalent to the
dependence of $\varepsilon_1, \ldots, \varepsilon_k$.

The truth of the theorem will now follow if we can prove that if
the equations (19) hold with $i = 1, \ldots, r_1 + r_2 - 1$, then the
remaining $n - r_1 - r_2 + 1$ equations are also correct. To show this,
suppose that the first $r_1 + r_2 - 1$ equations are true, and define

$$e_i = \begin{cases} 1 & \text{for } 1 \le i \le r_1, \\ 2 & \text{for } r_1 + 1 \le i \le n. \end{cases}$$

Since each $\varepsilon_m$ is a unit, its norm has absolute value 1;
by (16),

$$\sum_{i=1}^{n} \log |\varepsilon_m^{(i)}| = \sum_{i=1}^{r_1+r_2} e_i \log |\varepsilon_m^{(i)}| = 0.$$

Hence

$$\sum_{m=1}^{k} a_m \sum_{i=1}^{r_1+r_2} e_i \log |\varepsilon_m^{(i)}| = \sum_{i=1}^{r_1+r_2} e_i \sum_{m=1}^{k} a_m \log |\varepsilon_m^{(i)}| = 0,$$

so that

$$e_{r_1+r_2} \sum_{m=1}^{k} a_m \log |\varepsilon_m^{(r_1+r_2)}| = -\sum_{i=1}^{r_1+r_2-1} e_i \sum_{m=1}^{k} a_m \log |\varepsilon_m^{(i)}| = 0.$$

Thus (19) also holds for $i = r_1 + r_2$, and so, by (16), for
$i = 1, 2, \ldots, n$.
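
As a numerical illustration of Theorem 2-42, take $K = R(\sqrt{2})$, where $n = 2$, $r_1 = 2$, $r_2 = 0$, and $r = 1$. The units $\varepsilon_1 = 1 + \sqrt{2}$ and $\varepsilon_2 = 3 + 2\sqrt{2} = \varepsilon_1^2$ are dependent, with $a_1 = 2$, $a_2 = -1$ in (17). The sketch below (Python, floating-point arithmetic, not part of the original text) checks that the single equation (18) with $i = 1$ carries the second equation of (19) along with it; the helper name `log_conjugates` is illustrative.

```python
from math import log, sqrt

def log_conjugates(x, y, d):
    """(log|x + y*sqrt(d)|, log|x - y*sqrt(d)|): logs of the two conjugates."""
    s = sqrt(d)
    return log(abs(x + y * s)), log(abs(x - y * s))

eps1 = log_conjugates(1, 1, 2)     # 1 + sqrt(2)
eps2 = log_conjugates(3, 2, 2)     # 3 + 2*sqrt(2) = (1 + sqrt(2))**2
a = (2, -1)                        # exponents in relation (17)

# Left sides of (19) for i = 1 and i = 2; both vanish up to rounding.
rows = [a[0] * eps1[i] + a[1] * eps2[i] for i in range(2)]
print(rows)

# The norm of a unit has absolute value 1, so the two logs sum to zero.
print(eps1[0] + eps1[1])
```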

Theorem 2-43. If the relation (18) holds for some set of real numbers
$x_1, \ldots, x_k$ which are not all zero, it also holds for rational integers
$x_1, \ldots, x_k$ which are not all zero.

Proof: Suppose the hypothesis fulfilled. Since the system (18) is
certainly nontrivially solvable in rational integers if some $\varepsilon_m$ is in $U$,
it suffices to consider the case that all the units are of infinite order.
Then each unit separately is independent. Now suppose that the
units $\varepsilon_1, \ldots, \varepsilon_q$ are such that the equations

$$\sum_{m=1}^{q-1} \alpha_m \log |\varepsilon_m^{(i)}| = 0, \quad i = 1, \ldots, r, \qquad (20)$$

have the single real solution $\alpha_1 = \cdots = \alpha_{q-1} = 0$, while the system

$$\sum_{m=1}^{q} \alpha_m \log |\varepsilon_m^{(i)}| = 0, \quad i = 1, \ldots, r, \qquad (21)$$

has a nontrivial real solution $\alpha_1, \ldots, \alpha_q$. Then $2 \le q \le k$, $\alpha_q \ne 0$,
and the ratios $\alpha_1/\alpha_q, \ldots, \alpha_{q-1}/\alpha_q$ are uniquely determined, since
otherwise the differences of the respective ratios would provide a
nontrivial solution of (20). If we can show that these ratios are
rational numbers, the theorem will result by taking a suitable common
integral multiple of the numbers

$$x_m = \begin{cases} \alpha_m/\alpha_q & \text{for } 1 \le m \le q, \\ 0 & \text{for } q < m \le k. \end{cases}$$

If we put $\alpha_m/\alpha_q = -\beta_m$ for $m = 1, \ldots, q - 1$, equations (21)
imply (by the argument used in the proof of Theorem 2-42, which extends
the relations from $i \le r$ to all $n$ conjugates) that

$$\log |\varepsilon_q^{(i)}| = \sum_{m=1}^{q-1} \beta_m \log |\varepsilon_m^{(i)}|, \quad i = 1, \ldots, n. \qquad (22)$$

Now consider the set of all units $\eta$ with the property that

$$\log |\eta^{(i)}| = \sum_{m=1}^{q-1} \gamma_m \log |\varepsilon_m^{(i)}|, \quad i = 1, \ldots, n, \qquad (23)$$

for suitable real numbers $\gamma_1, \ldots, \gamma_{q-1}$. For such an $\eta$, the coefficients
$\gamma_m$ are unique. We call the set $\gamma_1, \ldots, \gamma_{q-1}$ of real numbers proper
if $\eta$, as defined in (23) with these $\gamma$'s, actually is a unit, and if in
addition $|\gamma_1| \le 1, \ldots, |\gamma_{q-1}| \le 1$. If $\gamma_1, \ldots, \gamma_{q-1}$ is a proper set,
then

$$\big|\log |\eta^{(i)}|\big| \le \sum_{m=1}^{q-1} \big|\log |\varepsilon_m^{(i)}|\big|,$$

and, by Theorem 2-40, there are only finitely many (say $H$) proper
sets. On the other hand, if $\gamma_1, \ldots, \gamma_{q-1}$ is proper, so also is

$$N\gamma_1 - [N\gamma_1], \ldots, N\gamma_{q-1} - [N\gamma_{q-1}],$$

if $N$ is a rational integer. For

$$\sum_{m=1}^{q-1} (N\gamma_m - [N\gamma_m]) \log |\varepsilon_m^{(i)}| = N \log |\eta^{(i)}| - \sum_{m=1}^{q-1} [N\gamma_m] \log |\varepsilon_m^{(i)}|,$$

which is the logarithm of a product of powers of units, and is
therefore the logarithm of a unit. Now if any $\beta_m$ were irrational, then no
two of the numbers $N\beta_m - [N\beta_m]$, where $N$ runs over $Z$, would be
equal, and we should have infinitely many proper sets. This
contradiction establishes the theorem.

Theorem 2-44. If $\varepsilon_1, \ldots, \varepsilon_k$ are units such that the only real
solution of (18) is the trivial solution, then there is a rational integer
$M$ with the following property: in order that a number $\eta$ such that

$$\log |\eta^{(i)}| = \sum_{m=1}^{k} \gamma_m \log |\varepsilon_m^{(i)}|, \quad i = 1, \ldots, n,$$

be a unit of $K$, it is necessary that all the numbers $M\gamma_m$ be rational
integers.

Proof: The hypothesis is that which was used in the preceding
proof, except that we have replaced $q - 1$ by $k$. Suppose that
$\gamma_m = a/b$, where $a$ and $b$ are rational integers with $b > 0$ and $(a, b) = 1$,
and $m$ is one of the integers $1, \ldots, k$. Then $N\gamma_m - [N\gamma_m]$ assumes
the $b$ values $0/b, 1/b, \ldots, (b - 1)/b$, so that $b \le H$, where $H$ is the
number of proper sets. Hence $b \mid H!$, and we can take $M = H!$.

Theorem 2-45. The group $E$ of all units of $K$ has a finite basis, the
number of basis elements of infinite order being at most $r$.

Proof: The system (18) of $r$ linear homogeneous equations in $k$
unknowns is certainly nontrivially solvable in reals if $k > r$, and it
follows from Theorem 2-43 that there are at most $r$ independent
units in $K$. Let $k$ be the exact maximal number of independent units,
and let $\varepsilon_1, \ldots, \varepsilon_k$ be such a set. Then by Theorem 2-44, for every
unit $\eta$ of $K$ there are $g_1, \ldots, g_k$ in $Z$ such that

$$M \log |\eta^{(i)}| = \sum_{m=1}^{k} g_m \log |\varepsilon_m^{(i)}|, \quad i = 1, \ldots, n.$$

By the second part of Theorem 2-40, and Theorem 2-41, it follows
that

$$\eta^M = \varepsilon_1^{g_1} \cdots \varepsilon_k^{g_k} \zeta_0^{g_0},$$

so that $\varepsilon_1, \ldots, \varepsilon_k, \zeta_0$ form a basis for the group of $M$th powers of units.
Now define the numbers $\xi_0, \ldots, \xi_k$ by the equations

$$\xi_0^M = \zeta_0, \quad \xi_m^M = \varepsilon_m \quad (m = 1, \ldots, k),$$

where an arbitrary but fixed $M$th root is taken in each case. The
numbers may not lie in $K$, but they form a basis for a group $E_M$ of
complex numbers, and $E_M$ clearly contains $E$ as a subgroup. The
theorem is therefore a consequence of the following general principle.


Theorem 2-46. If $G$ is a commutative group having a basis of $n$
elements, every subgroup of $G$ also has a basis, of at most $n$ elements.

Proof: Suppose that $\lambda_1, \ldots, \lambda_n$ is a basis for $G$, that $S$ is a subgroup
of $G$, and that some $\lambda_i$ actually occurs in the representation of some
element $s$ of $S$. Let $I_i$ be the set of all exponents which occur on $\lambda_i$
in the representations of the various elements of $S$. If $a$ is in $I_i$, so
is $ka$ for $k$ in $Z$, and if $a$ and $a'$ are in $I_i$, so is $a - a'$. Hence $I_i$ is an
ideal in $Z$, and is therefore a principal ideal, say $I_i = [a_i]$.

We now proceed by induction on $n$. If $n = 1$, then $\lambda_1^{a_1}$ is a basis
for $S$, by what we have just proved. Suppose that the theorem is
true for every commutative group with $n - 1$ basis elements, and
suppose that $G$ has $n$ basis elements, say $\lambda_1, \ldots, \lambda_n$. Let $S$ be a
subgroup of $G$. If every element of $S$ can be written in the form
$\lambda_1^{c_1} \cdots \lambda_{n-1}^{c_{n-1}}$,
the result follows from the induction hypothesis. Otherwise, suppose
that $I_n = [a]$, and let $\lambda$ be an element of $S$ in whose basis
representation $\lambda_n$ occurs with exponent $a$. Then for every $s$ in $S$ there exists a
$b_s$ in $Z$ such that $s\lambda^{b_s}$ has a representation
$\lambda_1^{c_1} \cdots \lambda_{n-1}^{c_{n-1}}$.
The set of numbers of the form $s\lambda^{b_s}$ is therefore a subgroup of the
group $G'$ which has $\lambda_1, \ldots, \lambda_{n-1}$ as a basis, and by the induction
hypothesis this subgroup also has a basis, of at most $n - 1$ elements.
This latter basis, together with $\lambda$, clearly constitutes a basis for $S$.



REFERENCES

Section 2-4

The complete tabulation of Euclidean domains is the work of many
writers. K. Inkeri (Annales Academiae Scientiarum Fennicae, Series A
(Helsinki) I, Mathematics-Physics, 41, 35 pp. (1947)) supplied the last link
in a chain of theorems which together show that if $d > 100$, then $R(\sqrt{d})$
is not Euclidean. E. S. Barnes and H. P. F. Swinnerton-Dyer (Acta
Mathematica (Stockholm) 87, 259-323 (1952)) showed that, contrary to what
had been believed, $R(\sqrt{97})$ is not Euclidean. P. Varnavides (Proceedings
Konink. Nederlandsche Akademie van Wetenschappen, Series A
(Amsterdam) 55, 111-122 (1952) or Indagationes Mathematicae (Amsterdam) 14,
111-122 (1952)) showed that the values of $d$ listed in the text yield
Euclidean domains.

Section 2-9

The material of this section is adapted from E. Hecke, Vorlesungen über
die Theorie der Algebraischen Zahlen, Leipzig: Akademische
Verlagsgesellschaft m.b.H., 1923; reprinted by Chelsea Publishing Company,
New York, 1948; pp. 116-131. It is proved there that the upper bound
obtained in the text is exact.



CHAPTER 3 


APPLICATIONS TO RATIONAL NUMBER THEORY 


3-1 Introduction. As was suggested in the preceding chapter,
there are many problems in rational number theory which are most
naturally treated in the more extensive framework of an algebraic
number field. Chief among these are various Diophantine equations;
indeed, it was the study of Fermat's equation, $x^n + y^n = z^n$, $n \ge 3$,
which was originally responsible for the development of ideal theory.
While this approach has not led to a complete verification of Fermat's
conjecture in all cases, it has produced results which would probably
never have been obtained using rational methods alone. In the first
part of this chapter we will discuss some results of this kind due to
E. Kummer. Here heavy use will be made of ideal theory.

The latter portion of the chapter is primarily concerned with a
theorem due to B. Delauney and T. Nagell, which asserts that the
cubic analog of Pell's equation,

$$x^3 + dy^3 = 1,$$

has at most one solution in nonzero rational integers $x$, $y$, and
completely characterizes this possible solution. (In the next chapter we
shall prove a less precise result about the general equation $x^n + dy^n = 1$,
$n \ge 3$.) Use is made here of the insolvability in $Z$ of

$$x^3 + y^3 = z^3,$$

but otherwise the two parts are mutually independent.

3-2 Equivalence and class number. We say that the ideals $A$
and $B$ of $R[\vartheta]$ are equivalent, and write $A \sim B$, if there are nonzero
elements $\alpha$ and $\beta$ of $R[\vartheta]$ such that

$$[\alpha]A = [\beta]B.$$

It is easily seen that $\sim$ is an equivalence relation. Moreover, if
$A \sim B$ and $C \sim D$, then $AC \sim BD$, and if $AC \sim BC$ then $A \sim B$.

Theorem 3-1. All principal ideals are equivalent. Any ideal
equivalent to a principal ideal is principal.

Proof: The first statement is trivial, since

$$[\alpha][\beta] = [\beta][\alpha].$$

If $A \sim [\alpha]$, then for some $\beta$ and $\gamma$,

$$[\beta]A = [\alpha][\gamma] = [\alpha\gamma],$$

and hence

$$[\beta] \mid [\alpha\gamma], \quad \beta \mid \alpha\gamma, \quad \alpha\gamma = \beta\delta,$$

$$[\beta]A = [\alpha\gamma] = [\beta][\delta],$$

$$A = [\delta].$$

Since equivalence is an equivalence relation, the ideals of $R[\vartheta]$ can
be separated into equivalence classes in the usual way. The number
$h$ of such classes is called the class number of the field; according to
Theorem 3-1, $h = 1$ if and only if every ideal is principal, i.e., if and
only if $R[\vartheta]$ is a unique factorization domain. We shall now show that
$h$ is always finite.

Theorem 3-2. There is a positive constant $c$, which depends only
on the field, such that each ideal $A$ divides a principal ideal $AB$ for
which

$$N(AB) \le c\,NA.$$

Proof: Let $\rho_1, \ldots, \rho_n$ be a field basis, and let $\rho_1^{(i)}, \ldots, \rho_n^{(i)}$
($i = 1, \ldots, n$) be the field conjugates of these numbers. We shall
show that the theorem is true with

$$c = \prod_{i=1}^{n} \big(|\rho_1^{(i)}| + \cdots + |\rho_n^{(i)}|\big).$$

Let $A$ be an arbitrary ideal, and let $k$ be the greatest rational
integer not exceeding $(NA)^{1/n}$, so that $k^n \le NA < (k + 1)^n$. Then
if $t_1, \ldots, t_n$ range independently over the integers $0, 1, \ldots, k$, there
are determined $(k + 1)^n$ different integers

$$t_1\rho_1 + \cdots + t_n\rho_n,$$

and two of them must be congruent modulo $A$:

$$u_1\rho_1 + \cdots + u_n\rho_n \equiv v_1\rho_1 + \cdots + v_n\rho_n \pmod A.$$

Thus

$$\alpha = (u_1 - v_1)\rho_1 + \cdots + (u_n - v_n)\rho_n$$

is in $A$, so that $A \mid [\alpha]$, and

$$N[\alpha] = |N\alpha| = \Big|\prod_{i=1}^{n} \Big(\sum_{j=1}^{n} (u_j - v_j)\rho_j^{(i)}\Big)\Big| \le \prod_{i=1}^{n} \sum_{j=1}^{n} |u_j - v_j|\,|\rho_j^{(i)}| \le \prod_{i=1}^{n} \sum_{j=1}^{n} k|\rho_j^{(i)}| = ck^n \le c\,NA.$$



Theorem 3-3. The class number of any algebraic number field is 
finite. 


Proof: It suffices to show that in each class there is an ideal $B$ such
that $NB \le c$, by the corollary to Theorem 2-35. Let $C$ be an
arbitrary ideal of a given class, and determine $A$ so that $AC$ is principal.
Then by Theorem 3-2, there is an ideal $B$ such that $AB$ is principal
and $N(AB) \le c\,NA$. Then $AB \sim AC$, $B \sim C$, and

$$NB = \frac{N(AB)}{NA} \le c.$$


Theorem 3-4. If $h$ is the class number, the $h$th power of any ideal is
principal.

Proof: If $A_1, \ldots, A_h$ is a complete system of representatives of the
various classes, and $A$ is arbitrary, then $AA_1, \ldots, AA_h$ is another
such system. Hence

$$A_1 \cdots A_h \sim AA_1 \cdots AA_h = A^h A_1 \cdots A_h,$$

so $A^h \sim [1]$ and $A^h$ is principal.

Theorem 3-5. If $p$ is a rational prime and $p \nmid h$, then $A^p \sim B^p$
implies $A \sim B$.

Proof: Since $p \nmid h$, there are positive $x$ and $y$ in $Z$ such that

$$px - hy = 1.$$

From the fact that

$$[\alpha]A^p = [\beta]B^p$$

we have

$$[\alpha]^x A^{px} = [\beta]^x B^{px},$$

$$[\alpha]^x A^{hy} A = [\beta]^x B^{hy} B,$$

and by Theorem 3-4, $A \sim B$.

Theorem 3-5 shows that the primes which do not divide h enjoy a 
property not shared by other primes. This is of great importance in 
the investigation of Fermat’s equation. 
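
The positive integers $x$ and $y$ with $px - hy = 1$ used in the proof of Theorem 3-5 are easy to exhibit. The sketch below is in Python and is not part of the original text; the function name is illustrative, and the values of $h$ in the loop are arbitrary sample values, not class numbers of particular fields. It relies on the three-argument form of the built-in `pow` with exponent $-1$ (Python 3.8 or later), which computes a modular inverse.

```python
def bezout_exponents(p, h):
    """Positive integers x, y with p*x - h*y = 1, assuming gcd(p, h) = 1."""
    # x is the inverse of p modulo h; for h = 1 any x works, so take x = 1.
    x = 1 if h == 1 else pow(p, -1, h)
    y = (p * x - 1) // h
    return x, y

for p, h in ((5, 3), (7, 1), (11, 8)):
    x, y = bezout_exponents(p, h)
    print(p, h, x, y, p * x - h * y)
```

When $p \mid h$ no such pair exists, and `pow` raises `ValueError`; this is exactly the case excluded by the hypothesis of the theorem.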


PROBLEMS 

1. Let $\alpha$ and $\beta$ be algebraic integers, not both zero. Show that there is an
integer $\delta$ such that, first, $\delta \mid \alpha$ and $\delta \mid \beta$ (in the sense that $\alpha/\delta$ and $\beta/\delta$ are again
integers), and, second, for suitable integers $\xi$ and $\eta$, $\delta = \alpha\xi + \beta\eta$. Show that
this GCD is unique up to an algebraic unit (i.e., an integer which divides 1).
[Hint: First settle the case $\alpha\beta = 0$. In the other case, let $K$ be an
algebraic number field of class number $h$, containing both $\alpha$ and $\beta$. Then
$[\alpha, \beta]^h = [\gamma]$, for some $\gamma$ in $K$. Let $\delta$ be an integer such that $\delta^h = \gamma$, and
show that the equation $[\alpha, \beta]^h = [\gamma]$ still holds when $[\alpha, \beta]$ and $[\gamma]$ are
interpreted as ideals in $K(\delta)$. Deduce that $[\alpha, \beta] = [\delta]$ in $K(\delta)$.] Does the
Unique Factorization Theorem hold in the domain of all algebraic integers?

2. Let $K$ be an algebraic number field. Show that to each ideal $A$ of $K$
there corresponds an integer $\alpha$ (not necessarily in $K$) such that the elements
of $A$ are exactly those integers of $K$ which are divisible by $\alpha$.


3-3 The cyclotomic field $K_p$. Let $p$ be an odd prime, let

$$\Phi(x) = x^{p-1} + x^{p-2} + \cdots + 1,$$

and let $\zeta = e^{2\pi i/p}$, so that the zeros of $\Phi$ are $\zeta, \zeta^2, \ldots, \zeta^{p-1}$, the
primitive $p$th roots of unity. The field $R(\zeta) = R(\zeta^2) = \cdots = R(\zeta^{p-1}) = K_p$
is called a cyclotomic field. It is clearly of degree
$p - 1$ at most. We put $1 - \zeta = \pi$. (The fact that the symbol $\pi$ is
used for two different numbers should occasion no confusion; the
number $\pi = 3.14159\ldots$ will occur only in the argument of the
exponential function.)

Theorem 3-6. In $K_p$, the ideal $[p]$ has the factorization

$$[p] = [\pi]^{p-1};$$

$[\pi]$ is prime, and $N[\pi] = p$; $\Phi$ is irreducible, and $K_p$ is of degree
$p - 1$.

Proof: Since $\zeta$ is an integer of $K_p$, so is

$$\varepsilon_r = \frac{1 - \zeta^r}{1 - \zeta} = 1 + \zeta + \cdots + \zeta^{r-1}, \quad 1 \le r \le p - 1;$$

if now an $r'$ is chosen so that $rr' \equiv 1 \pmod p$, then $\zeta^{rr'} = \zeta$, so that

$$\varepsilon_r^{-1} = \frac{1 - \zeta}{1 - \zeta^r} = \frac{1 - \zeta^{rr'}}{1 - \zeta^r} = 1 + \zeta^r + \cdots + \zeta^{r(r'-1)}$$

is also an integer. Hence $\varepsilon_r$ is a unit of $K_p$, and

$$p = \Phi(1) = \prod_{r=1}^{p-1} (1 - \zeta^r) = (1 - \zeta)^{p-1} \prod_{r=1}^{p-1} \varepsilon_r = \varepsilon\pi^{p-1},$$

where $\varepsilon$ is a unit. It follows from this equation that $[p] = [\pi]^{p-1}$
and also that $N\pi = p$. By Theorem 2-39, $\deg K_p \ge p - 1$, so that
$[\pi]$ is prime, $\deg K_p = p - 1$, and $\Phi$ is irreducible. (For a different
proof of the irreducibility of $\Phi$, see Problem 1, Section 2-2.)

Hereafter, we designate $[\pi]$ by $P$.
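
The two identities on which this proof rests, $p = \prod_{r=1}^{p-1} (1 - \zeta^r)$ and the finite-sum expression for $\varepsilon_r^{-1}$, can be checked numerically. The sketch below is in Python, uses floating-point complex arithmetic, and is not part of the original text; the function names are illustrative.

```python
import cmath

def zeta(p):
    """The primitive p-th root of unity exp(2*pi*i/p)."""
    return cmath.exp(2j * cmath.pi / p)

def product_one_minus_powers(p):
    """Product of (1 - zeta^r) for r = 1, ..., p - 1."""
    z = zeta(p)
    prod = 1
    for r in range(1, p):
        prod *= 1 - z ** r
    return prod

def epsilon(p, r):
    """The unit (1 - zeta^r) / (1 - zeta)."""
    z = zeta(p)
    return (1 - z ** r) / (1 - z)

def epsilon_inverse(p, r):
    """1 + zeta^r + ... + zeta^(r(r'-1)), where r*r' = 1 (mod p)."""
    z = zeta(p)
    r_inv = pow(r, -1, p)
    return sum(z ** (r * k) for k in range(r_inv))

for p in (3, 5, 7, 11):
    print(p, product_one_minus_powers(p))   # close to p, up to rounding
```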

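Theorem 3-7. Writing $\Delta(1, \zeta, \ldots, \zeta^{p-2}) = \Delta(\zeta)$,

$$\Delta(\zeta) = (-1)^{(p-1)/2} p^{p-2}.$$

Proof: From the representations

$$\Phi(x) = \prod_{r=1}^{p-1} (x - \zeta^r) = \frac{x^p - 1}{x - 1}$$

we obtain

$$\Phi'(\zeta^s) = \prod_{\substack{1 \le r \le p-1 \\ r \ne s}} (\zeta^s - \zeta^r) = \frac{p\zeta^{s(p-1)}(\zeta^s - 1) - (\zeta^{sp} - 1)}{(\zeta^s - 1)^2} = \frac{p\zeta^{s(p-1)}}{\zeta^s - 1}.$$

Since

$$\Delta(\zeta) = \begin{vmatrix} 1 & \zeta & \cdots & \zeta^{p-2} \\ 1 & \zeta^2 & \cdots & \zeta^{2(p-2)} \\ \vdots & & & \vdots \\ 1 & \zeta^{p-1} & \cdots & \zeta^{(p-2)(p-1)} \end{vmatrix}^2 = \prod_{1 \le r < s \le p-1} (\zeta^r - \zeta^s)^2,$$

we have

$$\Delta(\zeta) = (-1)^{(p-1)/2} \prod_{s=1}^{p-1} \prod_{\substack{1 \le r \le p-1 \\ r \ne s}} (\zeta^s - \zeta^r) = (-1)^{(p-1)/2} \prod_{s=1}^{p-1} \Phi'(\zeta^s) = (-1)^{(p-1)/2} \frac{p^{p-1}(N\zeta)^{p-1}}{N(\zeta - 1)} = (-1)^{(p-1)/2} p^{p-2}.$$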
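
The value of $\Delta(\zeta)$ can be confirmed numerically for small $p$ by multiplying out the squared differences $(\zeta^r - \zeta^s)^2$, $1 \le r < s \le p - 1$. The sketch below is in Python, uses floating-point complex arithmetic, and is not part of the original text; the names `cyclotomic_discriminant` and `closed_form` are illustrative.

```python
import cmath

def cyclotomic_discriminant(p):
    """Product of (zeta^r - zeta^s)^2 over 1 <= r < s <= p - 1."""
    z = cmath.exp(2j * cmath.pi / p)
    prod = 1
    for r in range(1, p):
        for s in range(r + 1, p):
            prod *= (z ** r - z ** s) ** 2
    return prod

def closed_form(p):
    """The value (-1)^((p-1)/2) * p^(p-2) asserted in Theorem 3-7."""
    return (-1) ** ((p - 1) // 2) * p ** (p - 2)

for p in (3, 5, 7, 11):
    print(p, cyclotomic_discriminant(p), closed_form(p))
```

The two printed values agree up to rounding error; for $p = 3$ both are $-3$, the discriminant of the field $K_3$.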

Theorem 3-8. The numbers $1, \zeta, \ldots, \zeta^{p-2}$ form an integral basis
for $K_p$, so that

$$\Delta = \Delta(\zeta) = (-1)^{(p-1)/2} p^{p-2}.$$

Proof: Suppose that $\alpha$ is an integer of $K_p$, and that

$$\alpha = r_0 + r_1\zeta + \cdots + r_{p-2}\zeta^{p-2},$$

where the $r$'s are rational. Then for $k = 0, 1, \ldots, p - 2$,

$$\zeta^k\alpha = \sum_{j=0}^{p-2} r_j\zeta^{j+k},$$

and since the trace function is clearly additive,

$$S(\zeta^k\alpha) = \sum_{j=0}^{p-2} S(r_j\zeta^{j+k}) = \sum_{j=0}^{p-2} r_j S(\zeta^{j+k}).$$

Solving this system of equations for the numbers $r_j$, we obtain

$$r_j = \frac{\text{a determinant in } \alpha \text{ and } \zeta}{\det |S(\zeta^j\zeta^k)|}.$$

But as we saw in the proof of Theorem 2-38, $\det |S(\zeta^j\zeta^k)| = \Delta(\zeta)$;
since the determinant in the numerator has the rational value $r_j\Delta(\zeta)$,
and is clearly an integer of $K_p$, it is a rational integer. Thus $\alpha$ can be
written in the form

$$\alpha = \frac{c_0 + c_1\zeta + \cdots + c_{p-2}\zeta^{p-2}}{p^{p-2}} = \frac{d_0 + d_1\pi + \cdots + d_{p-2}\pi^{p-2}}{p^{p-2}},$$

where the $c$'s, and therefore also the $d$'s, are in $Z$. Since $\alpha$ is an
integer,

$$p \mid (d_0 + d_1\pi + \cdots + d_{p-2}\pi^{p-2}),$$

and since $P \mid [p]$,

$$P \mid [d_0 + d_1\pi + \cdots + d_{p-2}\pi^{p-2}],$$

so $P \mid [d_0]$. It follows that $NP \mid N[d_0]$, $p \mid d_0^{p-1}$, and finally $p \mid d_0$. This
argument may be repeated $p - 2$ times, to show that $p \mid d_k$ for
$k = 1, \ldots, p - 2$, so that

$$\alpha = \frac{e_0 + e_1\pi + \cdots + e_{p-2}\pi^{p-2}}{p^{p-3}},$$

where the $e$'s are rational integers. Repeating the entire argument
$p - 3$ times, we see that

$$\alpha = f_0 + f_1\pi + \cdots + f_{p-2}\pi^{p-2},$$

where the $f$'s are in $Z$. Hence $1, \pi, \ldots, \pi^{p-2}$ form an integral basis
for $K_p$. But from the equations

$$\pi = 1 - \zeta, \qquad \zeta = 1 - \pi,$$

$$\pi^2 = 1 - 2\zeta + \zeta^2, \qquad \zeta^2 = 1 - 2\pi + \pi^2,$$

$$\cdots$$

we see that $\Delta(\pi) = a^2\Delta(\zeta)$ and $\Delta(\zeta) = a^2\Delta(\pi)$, where $a$ is a certain
determinant with binomial coefficients as entries. Hence $a^2 = 1$,
$\Delta(\zeta) = \Delta(\pi)$, and $1, \zeta, \ldots, \zeta^{p-2}$ also form an integral basis, by
Theorem 2-14.

Theorem 3-9. If $\alpha$ is an integer of $K_p$, there is a rational integer $a$
such that

$$\alpha^p \equiv a \pmod{P^p}.$$

Proof: Since $NP = p$, the incongruent numbers $0, 1, \ldots, p - 1$
form a complete residue system modulo $P$, so that for suitable $b$ in $Z$,

$$\alpha \equiv b \pmod P.$$

But

$$\alpha^p - b^p = \prod_{r=0}^{p-1} (\alpha - \zeta^r b),$$

and since $\zeta^r \equiv 1 \pmod P$, each of the $p$ factors satisfies
$\alpha - \zeta^r b \equiv \alpha - b \equiv 0 \pmod P$. Hence

$$\alpha^p - b^p \equiv 0 \pmod{P^p},$$

so that we can take $a = b^p$.

If $P \nmid [\alpha]$ and $\alpha \equiv a \pmod{P^2}$ for some $a$ in $Z$, then $\alpha$ is said to be
primary.

Theorem 3-10. If $\pi \nmid \alpha$, then for some positive rational integer $f$,
$\zeta^f\alpha$ is primary.

Proof: For suitable $a$ and $b$ in $Z$,

$$\alpha \equiv a + b\pi \pmod{P^2},$$

and $\pi \nmid \alpha$, so that $p \nmid a$. Choose $f$ so that

$$af \equiv b \pmod p.$$

Then since

$$\zeta^f = (1 - \pi)^f \equiv 1 - f\pi \pmod{P^2},$$

we have

$$\zeta^f\alpha \equiv (1 - f\pi)(a + b\pi) \equiv a + \pi(-af + b) \equiv a \pmod{P^2}.$$

We now investigate the units of Kp. 

Theorem 3-11. The only roots of unity in $K_p$ are the numbers $\pm\zeta^r$,
$0 \le r < p$.

Proof: The roots of unity are the numbers

$$e^{2\pi it/m},$$

where $t$ and $m$ are rational integers, and $(t, m) = 1$. If such a number
is in $K_p$, and if $tt' \equiv 1 \pmod m$, then also

$$e^{2\pi itt'/m} = e^{2\pi i/m}$$

is in $K_p$. The numbers mentioned in the theorem are the $(2p)$th
roots of unity, so we need only show that $e^{2\pi i/m}$ is not in $K_p$ if $m \nmid 2p$.
If $m \nmid 2p$, then either $4 \mid m$, or some odd prime $q \ne p$ divides $m$, or
$p^2 \mid m$. Suppose that $e^{2\pi i/m}$ is in $K_p$.

If $4 \mid m$, then

$$e^{2\pi i/4} = i$$

is in $K_p$. But then so are $1 + i$ and $1 - i$, and

$$[1 + i] = [1 - i] \quad \text{and} \quad [2] = [1 + i]^2,$$

contrary to Theorem 2-38.

If $q \mid m$, then

$$\alpha = e^{2\pi i/q}$$

is in $K_p$. But then the reasoning used in the proof of Theorem 3-6
shows that

$$[q] = [1 - \alpha]^{q-1},$$

again contradicting Theorem 2-38.

If $p^2 \mid m$, then

$$\xi = e^{2\pi i/p^2}$$

is in $K_p$. But $\xi$ is a zero of

$$\frac{x^{p^2} - 1}{x^p - 1} = x^{p(p-1)} + \cdots + x^p + 1 = \prod_{\substack{1 \le m < p^2 \\ p \nmid m}} (x - \xi^m),$$

and

$$p = \prod_{\substack{1 \le m < p^2 \\ p \nmid m}} (1 - \xi^m).$$

As before, the factors in this product are associated, and we get

$$[p] = [1 - \xi]^{p(p-1)},$$

contradicting Theorem 2-39.

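Theorem 3-12. Each unit $\varepsilon$ of $K_p$ can be written in the form

$$\varepsilon = \zeta^g\eta,$$

where $g$ is a positive rational integer and $\eta$ is real.

Proof: Express $\varepsilon$ in terms of the integral basis $1, \zeta, \ldots, \zeta^{p-2}$:

$$\varepsilon = f(\zeta),$$

where $f$ is a polynomial with rational integral coefficients. Then
clearly $\varepsilon_s = f(\zeta^s)$ is also a unit, since $N\varepsilon = \varepsilon_1 \cdots \varepsilon_{p-1} = \pm 1$. Moreover,

$$\varepsilon_{p-s} = f(\zeta^{p-s}) = f(\zeta^{-s}) = f(\overline{\zeta^s}) = \overline{\varepsilon_s},$$

where the bar denotes the complex conjugate, so that $\varepsilon_s\varepsilon_{p-s} = |\varepsilon_s|^2 > 0$,
and

$$N\varepsilon = \prod_{s=1}^{(p-1)/2} \varepsilon_s\varepsilon_{p-s} > 0,$$

so that $N\varepsilon = 1$. Since $\overline{\varepsilon_s} = \varepsilon_{p-s}$,

$$\left|\frac{\varepsilon_s}{\varepsilon_{p-s}}\right| = 1.$$

The polynomial

$$\prod_{s=1}^{p-1} \left(x - \frac{\varepsilon_s}{\varepsilon_{p-s}}\right) = \prod_{s=1}^{p-1} (\varepsilon_{p-s}x - \varepsilon_s)$$

has coefficients in $Z$, so, by Theorem 2-40, $\varepsilon_1/\varepsilon_{p-1}$ is a root of unity,
and by Theorem 3-11,

$$\varepsilon_1 = \pm\zeta^m\varepsilon_{p-1}.$$

Since either $m$ or $p + m$ is even, and since

$$\zeta^m = \zeta^{p+m},$$

we can write

$$\varepsilon_1 = \pm\zeta^{2g}\varepsilon_{p-1}, \quad g > 0.$$

The proof will be complete if it can be shown that the plus sign is
appropriate here, since then the quantities $\varepsilon_1\zeta^{-g}$ and $\varepsilon_{p-1}\zeta^{g}$ are
simultaneously equal and complex-conjugate, and are therefore real,
so that $\varepsilon = \varepsilon_1 = \zeta^g\eta$.

To show this, choose $a$ from among $0, 1, \ldots, p - 1$ so that

$$\zeta^{-g}\varepsilon_1 \equiv a \pmod P.$$

Then

$$\mu = \frac{\zeta^{-g}\varepsilon_1 - a}{\pi}$$

is an integer in $K_p$, as is

$$\bar{\mu} = \frac{\zeta^{g}\varepsilon_{p-1} - a}{\bar{\pi}}.$$

Since $\bar{\pi} = 1 - \zeta^{-1}$ is an associate of $\pi$, it follows that

$$\frac{\zeta^{g}\varepsilon_{p-1} - a}{\pi}$$

is an integer of $K_p$, so that

$$\zeta^{-g}\varepsilon_1 \equiv a \equiv \zeta^{g}\varepsilon_{p-1} \pmod P.$$

Thus

$$\varepsilon_1 = \pm\zeta^{2g}\varepsilon_{p-1} \quad \text{and} \quad \varepsilon_1 \equiv \zeta^{2g}\varepsilon_{p-1} \pmod P.$$

If the minus sign obtains in the equation, we have

$$\varepsilon_1 \equiv -\varepsilon_1 \pmod P,$$

$$2\varepsilon_1 \equiv 0 \pmod P,$$

$$NP \mid 2^{p-1},$$

contrary to the fact that $NP = p$. The proof is complete.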


PROBLEMS 

1. Let $p$ and $q$ be distinct odd primes, and let $\zeta$ be a primitive $p$th root
of unity.

(a) Show that 

p-i I(p-i) 

p= n 0-!-^) = n ar-r-y- 

a—1 a»l 

(b) Show that 

(r - (mod g). 

(c) Deduce that 

J(P—1) Off 

a ■! ” S 

(d) Show that the second factor on the right side of the last
congruence above is $(-1)^{\nu}$, where $\nu$ is the number of numerically smallest residues
(mod $p$) among $q, 2q, \ldots, \frac{1}{2}(p - 1)q$ which are negative, and so obtain a
proof of the law of quadratic reciprocity.

2. For an odd prime p and a positive integer h, put 


(mod g). 




ap* - 1 
— 1 




and let f be a zero of ■!>» and = KCf). Then the degree of is at most 

a(p'‘) = t. Put 1 - r = T, and W = P- 

(a) Show that in fCp», the ideal [pi has the factorization P>, P is prime, 

NP = p,$)i(x) is irreducible and is of degree (. 

(b) Show that A(r) = • P'' Nof' 

1 - is a zero of *i(l - x), an irreducible polynomial of degree 

p - 1 with leading coefficient (-U|- and constant term p, and deduce 

w”show that\ri^is any prile ideal in different from P, and if 
f* (mod L), then a = & (mod p^). 






3-4 Fermat's equation. For the sake of completeness, we consider
first the equation

$$x^n + y^n = z^n \qquad (1)$$

for the cases $n = 2$, 4, and 3. When these have been disposed of,
Fermat's assertion would be proved if it could be shown that (1)
has no solutions in rational integers $x$, $y$, $z$, with $xyz \ne 0$, if $n$ is a
prime larger than 3.

The proof that (1) is impossible when $n = 4$ depends on the
following theorem, which characterizes the solutions of (1) when
$n = 2$.

Theorem 3-13. A general primitive solution (i.e., a solution in
which $(x, y, z) = 1$) of

$$x^2 + y^2 = z^2, \quad y \text{ even}, \quad x > 0, \; y > 0, \; z > 0,$$

is given by

$$x = a^2 - b^2, \quad y = 2ab, \quad z = a^2 + b^2,$$

where $a$ and $b$ are prime to each other and not both odd, and $a > b > 0$.

Remark: It is clear that one of $x$ and $y$ must be even, since
otherwise $z^2 = x^2 + y^2 \equiv 2 \pmod 4$. There is no loss in generality in
assuming that it is $y$ which is even.

Proof: Suppose that $x^2 + y^2 = z^2$. Since $(x, y, z) = 1$, also
$(y, z) = 1$, so that $(z - y, z + y) = 1$ or 2. But $x$ is odd and $y$ is
even, so that $(z - y, z + y) = 1$. Hence, from the equation

$$x^2 = (z - y)(z + y),$$

we deduce that $z - y$ and $z + y$ must be squares, since they are
positive. Now if $t$ and $u$ are fixed integers of the same parity (both
odd or both even), there are integers $a$ and $b$ such that $t = a + b$ and
$u = a - b$. Hence we can put

$$z - y = (a - b)^2, \quad z + y = (a + b)^2,$$

which gives

$$z = a^2 + b^2,$$

$$y = 2ab,$$

$$x = (a - b)(a + b) = a^2 - b^2.$$

Since $(z - x, z + x) = (2b^2, 2a^2) = 2$, we must choose $a$ and $b$ so
that $(a, b) = 1$. Since $x$ is odd, $a + b$ must be odd. Since $y > 0$,
$a$ and $b$ must have the same sign, and since $x > 0$, $|a| > |b|$. Since
the pairs $a, b$ and $-a, -b$ give the same solution, we can suppose
that $a > b > 0$.
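
The parametrization of Theorem 3-13 translates directly into a procedure for listing primitive solutions. The sketch below is in Python and is not part of the original text; the function name `primitive_triples` and the bound `limit` on $z$ are illustrative choices.

```python
from math import gcd

def primitive_triples(limit):
    """Primitive solutions of x^2 + y^2 = z^2 with y even and z <= limit,
    generated from x = a^2 - b^2, y = 2ab, z = a^2 + b^2."""
    triples = []
    a = 2
    while a * a + 1 <= limit:
        for b in range(1, a):
            # a and b must be coprime and of opposite parity.
            if (a - b) % 2 == 1 and gcd(a, b) == 1:
                x, y, z = a * a - b * b, 2 * a * b, a * a + b * b
                if z <= limit:
                    triples.append((x, y, z))
        a += 1
    return triples

print(primitive_triples(30))
```

With `limit = 30` the pairs $(a, b) = (2, 1), (3, 2), (4, 1), (4, 3), (5, 2)$ contribute, giving the five solutions with $z = 5, 13, 17, 25, 29$.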

Theorem 3-14. The equation $x^4 + y^4 = z^2$ is not solvable in
nonzero rational integers.

Proof: It suffices to show that there is no primitive solution of
the equation

$$x^4 + y^4 = z^2.$$

Suppose that $x$, $y$, and $z$ constitute such a solution; with no loss in
generality we can take $x > 0$, $y > 0$, $z > 0$, and $y$ even. Writing
the supposed relation in the form

$$(x^2)^2 + (y^2)^2 = z^2,$$

we have from the preceding theorem that

$$x^2 = a^2 - b^2, \quad y^2 = 2ab, \quad z = a^2 + b^2,$$

where $(a, b) = 1$ and exactly one of $a$ and $b$ is odd. If $a$ were even,
we would have

$$1 \equiv x^2 = a^2 - b^2 \equiv -1 \pmod 4,$$

so $2 \mid b$. We apply Theorem 3-13 again, this time to the equation
$x^2 + b^2 = a^2$, and obtain

$$x = p^2 - q^2, \quad b = 2pq, \quad a = p^2 + q^2,$$

where $(p, q) = 1$, $p > q > 0$, and not both of $p$ and $q$ are odd. From

$$y^2 = 2ab$$

we have

$$y^2 = 4pq(p^2 + q^2).$$

Here $p$, $q$, and $p^2 + q^2$ are relatively prime in pairs, so each must be
a square:

$$p = r^2, \quad q = s^2, \quad p^2 + q^2 = t^2,$$

from which

$$r^4 + s^4 = t^2.$$

Now

$$z = a^2 + b^2 = r^8 + 6r^4s^4 + s^8, \quad y = 2rst,$$

so that $z > (r^4 + s^4)^2 = t^4$,
or $t < z^{1/4}$. It follows that if one solution of $x^4 + y^4 = z^2$ were
known, another solution $r$, $s$, $t$ could be found for which $rst \ne 0$ and
$0 < t < z^{1/4}$. But this would give an infinite decreasing sequence of
positive integers.

The case $n = 3$ is rather more difficult, since it is necessary to work
in the quadratic field $K_3 = R(\zeta)$, where $\zeta = (-1 + i\sqrt{3})/2$ is a
primitive cube root of unity. Not all the complications of the general
case are present, however, since there is unique factorization of the
integers of $K_3$, as the following theorem shows.

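Theorem 3-15. Given any two integers $\alpha$ and $\gamma$ of $K_3$, of which
$\gamma \ne 0$, there are integers $\kappa$ and $\rho$ such that

$$\alpha = \kappa\gamma + \rho, \quad 0 \le N\rho < N\gamma.$$

The integers of $K_3$ therefore form a Euclidean domain.

Proof: Since 1 and $\zeta$ form an integral basis for $K_3$, we can write

$$\frac{\alpha}{\gamma} = \frac{a + b\zeta}{c + d\zeta} = \frac{(a + b\zeta)(c + d\zeta^2)}{c^2 - cd + d^2} = R + S\zeta,$$

where $a$, $b$, $c$, and $d$ are rational integers, and $R$ and $S$ are rational.
Choose $x$ and $y$ in $Z$ such that

$$|R - x| \le \tfrac{1}{2}, \quad |S - y| \le \tfrac{1}{2};$$

then

$$N\left(\frac{\alpha}{\gamma} - (x + y\zeta)\right) = (R - x)^2 - (R - x)(S - y) + (S - y)^2 \le \tfrac{3}{4}.$$

Hence, if $\kappa = x + y\zeta$ and $\rho = \alpha - \kappa\gamma$, then

$N\rho \le \tfrac{3}{4}N\gamma < N\gamma$, and $N\rho = \rho\bar{\rho} = |\rho|^2 \ge 0$.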
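
The proof of Theorem 3-15 is constructive, and the division step can be carried out in exact arithmetic. In the sketch below (Python, not part of the original text) the integer $a + b\zeta$ of $K_3$ is stored as the pair `(a, b)`, products are reduced with $\zeta^2 = -1 - \zeta$, and $R$ and $S$ are rounded to nearest integers; the function names are illustrative.

```python
from fractions import Fraction

def mul(u, v):
    """(a + b*zeta)(c + d*zeta), reduced using zeta^2 = -1 - zeta."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c - b * d)

def norm(u):
    """N(a + b*zeta) = a^2 - ab + b^2."""
    a, b = u
    return a * a - a * b + b * b

def divmod_k3(alpha, gamma):
    """Return (kappa, rho) with alpha = kappa*gamma + rho, N(rho) < N(gamma)."""
    n = norm(gamma)
    if n == 0:
        raise ZeroDivisionError("gamma must be nonzero")
    c, d = gamma
    # alpha/gamma = alpha * conj(gamma) / N(gamma), conj(gamma) = c + d*zeta^2.
    num = mul(alpha, (c - d, -d))
    x = round(Fraction(num[0], n))      # |R - x| <= 1/2
    y = round(Fraction(num[1], n))      # |S - y| <= 1/2
    kappa = (x, y)
    prod = mul(kappa, gamma)
    rho = (alpha[0] - prod[0], alpha[1] - prod[1])
    return kappa, rho

kappa, rho = divmod_k3((7, 3), (2, 1))
print(kappa, rho, norm(rho), norm((2, 1)))
```

Iterating this step on the pair $(\gamma, \rho)$ gives a Euclidean algorithm in the integers of $K_3$, since the norms of the successive remainders strictly decrease.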
Theorem 3-16. The equation

$$\xi^3 + \eta^3 + \vartheta^3 = 0 \qquad (2)$$

has no solution in nonzero integers of $K_3$. It therefore has no
solution in nonzero rational integers.

Proof: We first note that one of $\xi$, $\eta$, and $\vartheta$ must be divisible by the
prime $\pi = 1 - \zeta$, if (2) holds. For put

$$\xi + \eta = \rho, \quad \eta + \vartheta = \sigma, \quad \vartheta + \xi = \tau.$$

Then a simple calculation, using (2), shows that

$$(\rho + \sigma + \tau)^3 = 24\rho\sigma\tau.$$
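
The "simple calculation" amounts to the polynomial identity $(\rho + \sigma + \tau)^3 = 8(\xi^3 + \eta^3 + \vartheta^3) + 24\rho\sigma\tau$, which holds in any commutative ring and reduces to the displayed equation when (2) holds. The sketch below (Python, not part of the original text) checks the identity on a grid of rational integers; the function name `identity_gap` is illustrative.

```python
from itertools import product

def identity_gap(xi, eta, theta):
    """(rho + sigma + tau)^3 - 24*rho*sigma*tau for the sums defined above."""
    rho, sigma, tau = xi + eta, eta + theta, theta + xi
    return (rho + sigma + tau) ** 3 - 24 * rho * sigma * tau

# The gap always equals 8*(xi^3 + eta^3 + theta^3), so it vanishes under (2).
ok = all(identity_gap(x, e, t) == 8 * (x ** 3 + e ** 3 + t ** 3)
         for x, e, t in product(range(-5, 6), repeat=3))
print(ok)
```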

Since the expression on the right side of this equation is divisible by

$$3 = -\zeta^2\pi^2,$$

the left side must be divisible by $\pi$, and therefore by $\pi^3$. Returning
to the right side, it follows that one of $\rho$, $\sigma$, or $\tau$ must be divisible
by $\pi$. If $\pi \mid \rho$, then $\pi \mid (\xi^3 + \eta^3)$, so $\pi \mid \vartheta^3$, and finally $\pi \mid \vartheta$.

If there were a common factor in two of $\xi$, $\eta$, and $\vartheta$, it would also
occur in the third, and could be divided out; so suppose that (2)
holds, that $\xi$, $\eta$, and $\vartheta$ are relatively prime in pairs, and that $\pi \mid \vartheta$.

By Theorem 3-10, we may suppose that an appropriate power of
$\zeta$ has been introduced into $\xi$ and $\eta$ so that

$$\xi \equiv 1, \quad \eta \equiv -1 \pmod 3,$$

which we express by putting

$$\xi = 1 + 3\alpha, \quad \eta = -1 + 3\beta,$$

where $\alpha$ and $\beta$ are integers of $K_3$. Put

$$A = \frac{\xi + \zeta\eta}{\pi}, \quad B = \frac{\zeta\xi + \eta}{\pi}, \quad C = \frac{\zeta^2(\xi + \eta)}{\pi};$$

these numbers are integers of $K_3$, since

$$A = 1 + \frac{3}{\pi}(\alpha + \zeta\beta),$$

$$B = -1 + \frac{3}{\pi}(\zeta\alpha + \beta),$$

$$C = \frac{3\zeta^2}{\pi}(\alpha + \beta).$$

Moreover,

$$A + B + C = 0, \qquad (3)$$

$$ABC = \left(-\frac{\vartheta}{\pi}\right)^3, \qquad (4)$$

$$\xi = -\zeta A + \zeta^2 B, \quad \eta = \zeta^2 A - \zeta B. \qquad (5)$$

From (5) we see that $(A, B) = 1$, since otherwise $\xi$ and $\eta$ would
have a common factor. From (3), also $(A, C) = (B, C) = 1$.

It follows from (4) that $A$, $B$, and $C$ must all be cubes, say
$A = \varphi^3$, $B = \chi^3$, $C = \psi^3$, and

$$\varphi^3 + \chi^3 + \psi^3 = 0.$$

Now $A \equiv 1$, $B \equiv -1$, $C \equiv 0 \pmod \pi$,
so that from (4), $\psi$ contains a smaller power of $\pi$ than does $\vartheta$.

Repeating the argument a sufficient number of times, we would
arrive eventually at a solution of (2) in which no variable is divisible
by $\pi$, which is impossible.


3-5 Kummer’s theorem. If p is an odd rational prime, and its associated cyclotomic field $K_p$ has class number h, then p is said to be regular if $p \nmid h$. According to Theorem 3-5, if p is a regular prime and A and B are ideals in $K_p$ such that $A^p \sim B^p$, then $A \sim B$. It was this essential property of the regular primes which enabled Kummer to prove that Fermat’s conjecture is correct for all regular primes. (Unfortunately, there are infinitely many irregular ones.) We shall not be able to prove Kummer’s theorem in its entirety, but shall have to assume without proof a difficult preliminary result. We can, however, prove the following theorem.


Theorem 3-17. If p is regular, the equation

\[ x^p + y^p + z^p = 0 \tag{6} \]

has no solution in rational integers x, y, z for which $p \nmid xyz$.


Proof: Suppose that the theorem is false, and that x, y, and z satisfy all the requirements. We can assume that (x, y) = 1 and p > 3; as usual, $\zeta$ is a primitive pth root of unity, and $P = [1 - \zeta]$.

From (6) we obtain

\[ \prod_{m=0}^{p-1} (x + \zeta^m y) = -z^p, \]

so that

\[ \prod_{m=0}^{p-1} [x + \zeta^m y] = [z]^p. \tag{7} \]

Now no two of the factors on the left have a common factor. For, if Q is a prime ideal such that $Q \mid [x + \zeta^{m_1} y]$ and $Q \mid [x + \zeta^{m_2} y]$ for $m_1 < m_2$, then

\[ Q \mid [\zeta^{m_1}(1 - \zeta^{m_2 - m_1})y], \]




and hence $Q \mid P[y]$. But from (7), $Q \mid [z]$, so $Q \ne P$ (since $p \nmid z$); hence $Q \mid [y]$. But then also $Q \mid [x]$, so that x and y lie in a common prime ideal, which is contrary to the assumption that (x, y) = 1.

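It follows that each factor on the left side of (7) is the pth power of an ideal. If

\[ [x + \zeta y] = A^p, \]

then $A^p \sim [1] = [1]^p$, so that by Theorem 3-5, A itself is principal, say $A = [\alpha]$. Then

\[ [x + \zeta y] = [\alpha]^p = [\alpha^p]. \]

Hence

\[ x + \zeta y = \varepsilon\alpha^p, \]

where $\varepsilon$ is a unit of $K_p$. Using the canonical form for units in $K_p$ obtained in Theorem 3-12, we have

\[ x + \zeta y = \zeta^g\gamma\alpha^p, \qquad 0 \le g \le p - 1, \]

where $\gamma$ is real. By Theorem 3-9, since $[p] \mid P^p$,

\[ \alpha^p \equiv a \pmod{[p]} \]

for some a in Z, so that

\[ x + \zeta y \equiv \zeta^g\sigma \pmod{[p]}, \]

where $\sigma$ is a real integer of $K_p$. The complex conjugate of the integer

\[ \frac{\zeta^{-g}(x + \zeta y) - \sigma}{p} \]

is also a field conjugate, and is therefore also an integer. Since $\bar{p} = p$ and $\bar{\sigma} = \sigma$, we have

\[ \sigma \equiv (x + \zeta y)\zeta^{-g} \pmod{[p]}, \]

and

\[ \sigma \equiv (x + \zeta^{-1} y)\zeta^{g} \pmod{[p]}, \]

so that

\[ x\zeta^{-g} + y\zeta^{1-g} - x\zeta^{g} - y\zeta^{g-1} \equiv 0 \pmod{[p]}. \tag{8} \]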

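Two of these exponents must be congruent modulo p. For suppose that they are all distinct, and put

\[ \beta = \frac{x}{p}\zeta^{-g} + \frac{y}{p}\zeta^{1-g} - \frac{x}{p}\zeta^{g} - \frac{y}{p}\zeta^{g-1}. \]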




Then $p\beta$ has a representation in terms of distinct elements of an integral basis, the coefficients not being divisible by p. But since $\beta$ is an integer, $p\beta$ also has a representation in which the coefficients are divisible by p, and this is contrary to the definition of a basis. We conclude that g must have one of the values 0, 1, or (p + 1)/2 (that is, $2g \equiv 1 \pmod{p}$).

If g = 0, the congruence (8) gives

\[ y\zeta - y\zeta^{-1} \equiv 0 \pmod{[p]}, \]

whence, since $\zeta^2 - 1$ is an associate of $\pi$,

\[ p \mid y, \]

which is false. If g = 1, then (8) yields

\[ x(1 - \zeta^2) \equiv 0 \pmod{[p]}, \]

which implies that $p \mid x$, which is also false. Finally, if g = (p + 1)/2, then from (8) we get

\[ (x - y)\pi \equiv 0 \pmod{[p]}, \]

which gives

\[ x \equiv y \pmod{p}. \]

Interchanging y and z in equation (7), we deduce that also

\[ x \equiv z \pmod{p}. \]

But then equation (6) implies that

\[ x^p + y^p + z^p \equiv 3x^p \equiv 0 \pmod{p}, \]

which is false since p > 3 and $p \nmid x$. Hence the theorem is not false.

Because of its methodological interest, we deduce the general 
Kummer theorem from the following lemma, whose proof is too long 
for inclusion here: 


Kummer’s lemma. Let p be a regular prime. Then if $\varepsilon$ is a unit of $K_p$ and a is a rational integer such that

\[ \varepsilon \equiv a \pmod{P^p}, \]

then $\varepsilon$ is the pth power of another unit of $K_p$.

This is a partial converse of Theorem 3-9. Using it, we can generalize Theorem 3-17 in two ways: by allowing x, y, and z to be integers of $K_p$ instead of rational integers, or by dropping the restriction that $p \nmid xyz$.

Theorem 3-18. If p is a regular prime, the equation

\[ x^p + y^p + z^p = 0 \]

has no solution in nonzero integers x, y, z of $K_p$ for which $\pi \mid xyz$. It therefore has no solution in nonzero rational integers x, y, z for which $p \mid xyz$, and therefore (by Theorem 3-17) no nonzero rational integral solutions.

Proof: We first show that the equation

\[ x^p + y^p = \varepsilon\pi^{pu}z^p, \qquad \pi \nmid xyz; \quad \varepsilon \text{ a unit of } K_p, \tag{9} \]

has no nontrivial solution if u = 1. Equation (9) is a generalization of the equation obtained from (6) by supposing that $z = z'\pi^u$, where $\pi \nmid z'$.

We may suppose that x and y have no common numerical factor, since it would also occur in z and could be canceled out. (Notice that it cannot be assumed that the ideals [x] and [y] are relatively prime, since [x, y] may not be principal.) We may also suppose that x and y are primary, since they may be multiplied by appropriate powers of $\zeta$ without affecting (9). If (9) is written in the form

\[ \prod_{m=0}^{p-1} (x + \zeta^m y) = \varepsilon\pi^{pu}z^p, \tag{9'} \]

it is clear that at least one of the factors on the left, say $x + \zeta^k y$, is divisible by $\pi$. Since, however, the differences

\[ (x + \zeta^k y) - (x + \zeta^l y) = (\zeta^k - \zeta^l)y \]

and

\[ \zeta^l(x + \zeta^k y) - \zeta^k(x + \zeta^l y) = (\zeta^l - \zeta^k)x \]

are also divisible by $\pi$, each factor on the left in (9') must be divisible by $\pi$. If two factors were divisible by $\pi^2$, we would have

\[ \pi^2 \mid (\zeta^k - \zeta^l)y, \qquad \pi^2 \mid \varepsilon'\pi y \quad (\varepsilon' \text{ a unit}), \]

so that $\pi \mid y$, and similarly $\pi \mid x$, contrary to assumption. On the other hand, since x and y are primary, there is an a in Z such that

\[ x + y \equiv a \pmod{P^2}. \]

But then

\[ a \equiv x + y \equiv 0 \pmod{P}, \qquad p \mid a, \qquad P^{p-1} \mid [a], \qquad x + y \equiv 0 \pmod{P^2}. \]

Thus the total number of factors of $\pi$ on the left side of (9') is at least p + 1, so that u > 1.

Now rewrite (9') as

\[ \prod_{m=0}^{p-1} [x + \zeta^m y] = P^{pu}[z]^p. \tag{9''} \]

Any common factor different from P of two ideals in the product on the left side of (9'') must be a factor of both [x] and [y], and therefore of [z]. After dividing out every such common factor, as well as one factor of P from each ideal, the ideals remaining on the left are pairwise prime, and their product is a pth power; therefore each factor separately is a pth power.

Combining all these results, we can write

\[ [x + y] = P^{p(u-1)+1} J_0^p D, \]

\[ [x + \zeta^m y] = P J_m^p D, \qquad m = 1, \ldots, p - 1, \]

where D = [x, y], and $J_0, J_1, \ldots, J_{p-1}$ are certain ideals not divisible by P. If we put $t_m = p(u - 1) + 1$ or 1, according as m = 0 or m > 0, we have, for $m \ne 1$,

\[ [x + \zeta y] P^{t_m} J_m^p D = [x + \zeta y][x + \zeta^m y] = P J_1^p D [x + \zeta^m y]; \]

since P is a principal ideal, it follows that

\[ J_m^p D \sim J_1^p D, \]

so that $J_m^p \sim J_1^p$, and by Theorem 3-5, $J_m \sim J_1$. Thus integers $\gamma_m$ and $\delta_m$ (which are not divisible by $\pi$) exist such that

\[ [\gamma_m] J_m = [\delta_m] J_1, \qquad m = 0, 2, 3, \ldots, p - 1. \tag{10} \]

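Raising both sides of (10) (with m = 0) to the pth power, and then multiplying through by DP, we have

\[ [\gamma_0^p] D P J_0^p = [\delta_0^p] D P J_1^p, \]

\[ [\gamma_0^p][x + y] = [\delta_0^p][x + \zeta y] P^{p(u-1)}. \]

Similarly,

\[ D P [\gamma_2^p] J_2^p = D P [\delta_2^p] J_1^p, \]

\[ [x + \zeta^2 y][\gamma_2^p] = [x + \zeta y][\delta_2^p], \]

so that

\[ \gamma_0^p(x + y) = \varepsilon_1(x + \zeta y)\delta_0^p\pi^{p(u-1)}, \qquad \gamma_2^p(x + \zeta^2 y) = \varepsilon_2(x + \zeta y)\delta_2^p, \tag{11} \]

where $\varepsilon_1$ and $\varepsilon_2$ are units.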

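We now use the identity

\[ (x + \zeta^2 y) + (x + y)\zeta = (x + \zeta y)(1 + \zeta). \]

We multiply through by $\gamma_0^p\gamma_2^p$, and in the resulting equation replace the left sides of equations (11) by the right sides. After canceling the common factor $x + \zeta y$, there results

\[ \varepsilon_2(\gamma_0\delta_2)^p + \varepsilon_1\zeta\pi^{p(u-1)}(\gamma_2\delta_0)^p = (1 + \zeta)(\gamma_0\gamma_2)^p. \]

Since $\varepsilon_1$, $\varepsilon_2$, $\zeta$, and $1 + \zeta = (1 - \zeta^2)/(1 - \zeta)$ are units, this equation is of the form

\[ \xi^p + \varepsilon_3\eta^p = \varepsilon_4\pi^{p(u-1)}\theta^p, \tag{12} \]

where $\varepsilon_3$ and $\varepsilon_4$ are units and $\pi \nmid \xi\eta\theta$. By Theorem 3-9,

\[ \xi^p \equiv a_1, \qquad \eta^p \equiv a_2 \pmod{P^p}, \]

where $a_1$ and $a_2$ are rational integers; since u > 1, (12) gives

\[ a_1 + \varepsilon_3 a_2 \equiv 0 \pmod{P^p}. \]

Since $\pi \nmid \eta$, also $\pi \nmid a_2$, so that $p \nmid a_2$. Choose $a_3$ so that $a_2 a_3 \equiv 1 \pmod{p^2}$; then

\[ a_2 a_3 \equiv 1 \pmod{P^p}, \]

\[ a_1 a_3 + \varepsilon_3 \equiv 0 \pmod{P^p}. \]

By Kummer’s lemma, $\varepsilon_3 = \varepsilon_5^p$, and (12) becomes

\[ \xi^p + (\varepsilon_5\eta)^p = \varepsilon_4\pi^{p(u-1)}\theta^p, \]

which is an equation of the form (9) with u replaced by u - 1. Repeating the argument u - 2 times, we would have a solution of (9) with u = 1, which is impossible.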




Before leaving the subject of Fermat’s conjecture, it might be of some interest to mention certain other facts known about it. We consider only the solvability of equation (6),

\[ x^p + y^p + z^p = 0, \]

in Z.

It was proved by Wieferich in 1909 that if (6) holds in integers x, y, and z such that $p \nmid xyz$ (the so-called Case I), then

\[ 2^{p-1} \equiv 1 \pmod{p^2}. \]

Later investigators have shown that in Case I,

\[ q^{p-1} \equiv 1 \pmod{p^2} \]

for every prime $q \le 43$; J. B. Rosser used this fact to show that there are no solutions in Case I for p < 41,000,000. D. H. and Emma Lehmer later extended Rosser’s method to prove Fermat’s conjecture in Case I for p < 253,747,889. This in turn implies that if there is a solution in Case I, it must be that log log z > 23.
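Wieferich's criterion is easy to test by machine, which gives some feeling for how restrictive it is. The sketch below is only an illustration, with hypothetical helper names `is_prime` and `wieferich_primes`; it lists the odd primes p below a bound for which $2^{p-1} \equiv 1 \pmod{p^2}$.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def wieferich_primes(limit):
    """Odd primes p < limit with 2^(p-1) congruent to 1 modulo p^2."""
    return [p for p in range(3, limit, 2)
            if is_prime(p) and pow(2, p - 1, p * p) == 1]

print(wieferich_primes(4000))
```

Only primes passing this test could admit a Case I solution, so for every other prime below the bound Case I is settled by Wieferich's theorem alone.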

Without the restriction to Case I, Theorem 3-18 disposes of the 
regular primes. Kummer also found criteria to handle the irregular 
primes less than 164; this was pushed on to all p < 619 by H. S. 
Vandiver and his collaborators, and quite recently D. H. and E. 
Lehmer and Vandiver have used high-speed computing techniques 
to settle the problem for all p < 2000. It turns out that of the 302 
primes less than 2000, 118 are irregular; while it is not known that 
there are infinitely many regular primes, there is nothing in the 
limited data available to indicate that there are only finitely many. 

3-6 The equation $x^2 + 2 = y^3$. For the remainder of this chapter we shall be primarily concerned with the cubic analog of Pell’s equation. At one point in the argument, however, we shall need the following auxiliary result.

Theorem 3-19. The only solutions in Z of the equation

\[ x^2 + 2 = y^3 \tag{13} \]

are $x = \pm 5$, y = 3.

Proof: Following Euler’s idea, we make use of the arithmetic of the quadratic field $R(\sqrt{-2})$. By Theorem 2-16, the integers of this field are of the form $a + b\sqrt{-2}$, where a and b are rational integers. By a proof exactly paralleling that of Theorem 3-15, it can be shown that they form a Euclidean domain: given a, b, c, d in Z, with $cd \ne 0$, there are e, f, g, h in Z such that

\[ a + b\sqrt{-2} = (c + d\sqrt{-2})(e + f\sqrt{-2}) + (g + h\sqrt{-2}), \]

\[ g^2 + 2h^2 < c^2 + 2d^2. \]

It follows that $R[\sqrt{-2}]$ is a unique factorization domain.

We first show that if x and y satisfy (13), then $x + \sqrt{-2}$ and $x - \sqrt{-2}$ are relatively prime. It is clear that

\[ (x + \sqrt{-2},\ x - \sqrt{-2}) \mid 2\sqrt{-2}, \]

and since $-2\sqrt{-2} = (\sqrt{-2})^3$ and $\sqrt{-2}$ is prime in the domain (by Theorem 2-39), it must be that $(x + \sqrt{-2},\ x - \sqrt{-2}) = (\sqrt{-2})^m$, $0 \le m \le 3$. But if $x + \sqrt{-2} = (a + b\sqrt{-2})\sqrt{-2}$, then x = -2b, whence, by (13),

\[ 4b^2 + 2 = y^3 \equiv 2 \pmod{4}, \]

which is impossible.

Since the only units of $R(\sqrt{-2})$ are $\pm 1$, it follows from (13) that

\[ x + \sqrt{-2} = (a + b\sqrt{-2})^3, \]

where a and b are rational integers, and equating real and imaginary parts gives

\[ a^3 - 6ab^2 = x, \]

\[ 3a^2b - 2b^3 = 1. \]

From the second of these equations it follows that $b = \pm 1$, and hence that $3a^2 - 2 = \pm 1$, or $a = \pm 1$. From the first, $x = \pm 1 \mp 6 = \pm 5$.
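The theorem on $x^2 + 2 = y^3$ can be confirmed by direct search over a finite range. The following sketch is only an illustration (the function name `solutions` and the search bound are our own choices); it looks for all integer solutions with y up to a given bound.

```python
from math import isqrt

def solutions(y_max):
    """All integer pairs (x, y) with x^2 + 2 = y^3 and 1 <= y <= y_max."""
    found = []
    for y in range(1, y_max + 1):
        n = y ** 3 - 2  # candidate value of x^2
        if n < 0:
            continue
        r = isqrt(n)
        if r * r == n:
            found.append((r, y))
            if r != 0:
                found.append((-r, y))
    return sorted(found)

print(solutions(10000))
```

A finite search proves nothing by itself, of course; it merely agrees with what the unique factorization argument establishes for all y.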

3-7 Pure cubic fields. The field $L = R(\sqrt[3]{d})$, in which d > 1 is a cube-free rational integer and $\sqrt[3]{d}$ is real, is called a pure cubic field. In this section we determine an integral basis for L and note certain other properties.

Since d is cube-free, we can write

\[ d = ab^2, \]

where ab is square-free. Since $\sqrt[3]{d^2} = b\sqrt[3]{a^2b}$, the numbers 1, $\sqrt[3]{ab^2}$, $\sqrt[3]{a^2b}$ form a basis for L. Following Dedekind, we say that L is of the first or second kind, according as 9 does not, or does, divide $a^2 - b^2$. The reason for the distinction is made clear in the following theorem.

Theorem 3-20. The numbers

\[ 1, \quad \sqrt[3]{ab^2}, \quad \sqrt[3]{a^2b} \]

form an integral basis for L if it is of the first kind. The numbers

\[ \tfrac{1}{3}\left(1 + a\sqrt[3]{ab^2} + b\sqrt[3]{a^2b}\right), \quad \sqrt[3]{ab^2}, \quad \sqrt[3]{a^2b} \]

form an integral basis for L if it is of the second kind.

Remark: Note that the second basis represents every integer represented by the first, since

\[ z_1 + z_2\sqrt[3]{ab^2} + z_3\sqrt[3]{a^2b} = 3z_1 \cdot \frac{1 + a\sqrt[3]{ab^2} + b\sqrt[3]{a^2b}}{3} + (z_2 - az_1)\sqrt[3]{ab^2} + (z_3 - bz_1)\sqrt[3]{a^2b}. \]

Proof: Suppose that $\omega$ is an integer in L, and that

\[ \omega = x_1 + x_2\sqrt[3]{ab^2} + x_3\sqrt[3]{a^2b}, \qquad x_1, x_2, x_3 \text{ in } R. \]

Then the conjugates of $\omega$ are

\[ \omega' = x_1 + \rho x_2\sqrt[3]{ab^2} + \rho^2 x_3\sqrt[3]{a^2b}, \]

\[ \omega'' = x_1 + \rho^2 x_2\sqrt[3]{ab^2} + \rho x_3\sqrt[3]{a^2b}, \]

where $\rho$ is a primitive cube root of unity. We see that

\[ \omega + \omega' + \omega'' = 3x_1, \]

\[ \sqrt[3]{a^2b}\,(\omega + \rho^2\omega' + \rho\omega'') = 3abx_2, \]

\[ \sqrt[3]{ab^2}\,(\omega + \rho\omega' + \rho^2\omega'') = 3abx_3, \]

and since the left sides of these equations are algebraic integers and the right sides are rational, it follows that the numbers $3x_1$, $3abx_2$, $3abx_3$ are rational integers. Hence for any integer $\omega$ in L, there are $y_1, y_2, y_3$ in Z such that

\[ 3ab\omega = y_1 + y_2\sqrt[3]{ab^2} + y_3\sqrt[3]{a^2b}. \tag{14} \]





We show first that ab is a divisor of $y_1$, $y_2$, and $y_3$, and so can be omitted in (14).

Let p be a rational prime dividing a, and let P be a prime ideal of L which divides [p]. It was supposed that ab is square-free; a fortiori, (a, b) = 1, and $P \nmid [b]$. If we put

\[ \alpha = \sqrt[3]{ab^2}, \qquad \beta = \sqrt[3]{a^2b}, \]

then $P \mid [\alpha]^3$, so $P \mid [\alpha]$; since L is of degree 3, it follows from Theorem 2-39 that $[p] = P^3$. Hence $P \,\|\, [\alpha]$ and $P^2 \,\|\, [\beta]$.

Now suppose, in accordance with (14), that

\[ y_1 + y_2\alpha + y_3\beta \equiv 0 \pmod{3ab}. \]

Then

\[ y_1 + y_2\alpha + y_3\beta \equiv 0 \pmod{P^3}, \]

\[ y_1 \equiv 0 \pmod{P}, \]

\[ y_1 \equiv 0 \pmod{p}, \tag{15} \]

\[ y_1 \equiv 0 \pmod{P^3}, \]

\[ y_2\alpha + y_3\beta \equiv 0 \pmod{P^3}, \]

\[ y_2\alpha \equiv 0 \pmod{P^2}, \]

\[ y_2 \equiv 0 \pmod{p}, \tag{16} \]

\[ y_3\beta \equiv 0 \pmod{P^3}, \]

\[ y_3 \equiv 0 \pmod{p}. \tag{17} \]

By equations (15), (16), and (17), and the fact that p was an arbitrary prime divisor of a, we see that a divides $y_1$, $y_2$, $y_3$. Similarly, b divides $y_1$, $y_2$, and $y_3$. It follows that there are $z_1, z_2, z_3$ in Z such that

\[ 3\omega = z_1 + z_2\alpha + z_3\beta. \tag{18} \]

Let the defining equation of $\omega$ be

\[ x^3 + c_1x^2 + c_2x + c_3 = 0, \qquad c_1, c_2, c_3 \text{ in } Z. \]

Then by (18) and the analogous equations for $3\omega'$ and $3\omega''$,

\[ c_1 = -(\omega + \omega' + \omega'') = -z_1, \]

\[ c_2 = \omega\omega' + \omega\omega'' + \omega'\omega'' = \tfrac{1}{3}(z_1^2 - abz_2z_3), \tag{19} \]

\[ c_3 = -\omega\omega'\omega'' = -\tfrac{1}{27}(z_1^3 + ab^2z_2^3 + a^2bz_3^3 - 3abz_1z_2z_3). \tag{20} \]

Suppose that $3 \mid a$; then $3 \nmid b$, and L is of the first kind. Since $c_2$ is in Z, $3 \mid z_1$. Since $c_3$ is in Z,

\[ 0 \equiv -27c_3 \equiv ab^2z_2^3 \pmod{9}, \]

whence $3 \mid z_2$, and by (20) again, $3 \mid z_3$. In this case, then, the numbers 1, $\sqrt[3]{ab^2}$, $\sqrt[3]{a^2b}$ constitute an integral basis for L. A similar argument applies in the case that $3 \mid b$.

Suppose now that $3 \nmid ab$, so that

\[ a^2 \equiv b^2 \equiv 1 \pmod{3}. \tag{21} \]

If $3 \mid z_1$, then by (19), $3 \mid z_2z_3$; if $3 \mid z_2$, say, then it follows from (20) that also $3 \mid z_3$. Similarly, if $3 \mid z_2$, then also $3 \mid z_1$ and $3 \mid z_3$. Hence 3 divides all or none of $z_1$, $z_2$, $z_3$; in the first case $\omega$ is of the form specified in the theorem.

We now examine the possibility that $\omega$ in (18) is an integer, but that $3 \nmid z_1z_2z_3$. Then by (20), (21), and Fermat’s theorem,

\[ z_1^3 + ab^2z_2^3 + a^2bz_3^3 \equiv 0 \pmod{3}, \]

\[ z_1 + az_2 + bz_3 \equiv 0 \pmod{3}, \]

\[ z_1 \equiv -az_2 - bz_3 \pmod{3}, \]

\[ z_2 \equiv az_1, \qquad z_3 \equiv bz_1 \pmod{3}, \]

\[ z_2 = az_1 + 3t_2, \qquad z_3 = bz_1 + 3t_3. \]

Substituting these expressions for $z_2$ and $z_3$ into (20), we obtain

\[ -27c_3 = z_1^3 + ab^2(az_1 + 3t_2)^3 + a^2b(bz_1 + 3t_3)^3 - 3abz_1(az_1 + 3t_2)(bz_1 + 3t_3) \]

\[ = z_1^3(1 + a^4b^2 + a^2b^4 - 3a^2b^2) + 9z_1^2(a^3b^2t_2 + a^2b^3t_3 - a^2bt_3 - ab^2t_2) + 27z_1(a^2b^2t_2^2 + a^2b^2t_3^2 - abt_2t_3) + 27(ab^2t_2^3 + a^2bt_3^3) \]

\[ \equiv z_1^3(1 + a^4b^2 + a^2b^4 - 3a^2b^2) + 9z_1^2\bigl(ab^2t_2(a^2 - 1) + a^2bt_3(b^2 - 1)\bigr) \pmod{27}. \]

By (21),

\[ 0 \equiv -27c_3 \equiv z_1^3(1 + a^4b^2 + a^2b^4 - 3a^2b^2) \pmod{27}, \]

and it follows that

\[ 1 + a^4b^2 + a^2b^4 - 3a^2b^2 \equiv 0 \pmod{27}. \tag{22} \]




Using (21), we can put

\[ b^2 = a^2 + 3f + 9g, \]

where f and g are rational integers and $0 \le f \le 2$. Then the congruence (22) reduces to

\[ \varphi(f, a) = 2a^6 + (9f - 3)a^4 + 9(f^2 - f)a^2 + 1 \equiv 0 \pmod{27}. \]

For f = 0, this becomes

\[ (a^2 - 1)^2(2a^2 + 1) \equiv 0 \pmod{27}, \]

which is true for every a not divisible by 3, since

\[ 2a^2 + 1 \equiv a^2 - 1 \equiv 0 \pmod{3}. \]

Moreover, for every a such that $3 \nmid a$,

\[ \varphi(1, a) = \varphi(0, a) + 9a^4 \equiv 9a^4 \not\equiv 0 \pmod{27}, \]

\[ \varphi(2, a) = \varphi(0, a) + 18a^4 + 18a^2 \equiv 18a^2(a^2 + 1) \not\equiv 0 \pmod{27}. \]

Thus we find that if $3 \nmid z_1z_2z_3$, then $c_3$ is in Z if and only if

\[ a^2 - b^2 \equiv 0 \pmod{9} \]

(i.e., if and only if L is of the second kind) and $az_2 \equiv bz_3 \pmod{3}$. If this is the case, then $c_1$ and $c_2$ are also rational integers, and

\[ \omega = \tfrac{1}{3}\bigl(z_1 + (az_1 + 3t_2)\alpha + (bz_1 + 3t_3)\beta\bigr) = z_1 \cdot \frac{1 + a\alpha + b\beta}{3} + t_2\alpha + t_3\beta \]

is an integer in L. The proof is complete.
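The decisive step in the proof is the equivalence, for a and b prime to 3, between the congruence $1 + a^4b^2 + a^2b^4 - 3a^2b^2 \equiv 0 \pmod{27}$ and the condition $9 \mid a^2 - b^2$ defining fields of the second kind. The sketch below checks this equivalence over a small range; the function names are hypothetical and the range is an arbitrary choice.

```python
def congruence_22(a, b):
    """True when 1 + a^4 b^2 + a^2 b^4 - 3 a^2 b^2 is divisible by 27."""
    return (1 + a**4 * b**2 + a**2 * b**4 - 3 * a**2 * b**2) % 27 == 0

def second_kind(a, b):
    """Dedekind's condition: 9 divides a^2 - b^2."""
    return (a * a - b * b) % 9 == 0

def equivalence_holds(limit):
    """Compare both conditions for all a, b in [1, limit) not divisible by 3."""
    return all(congruence_22(a, b) == second_kind(a, b)
               for a in range(1, limit) if a % 3
               for b in range(1, limit) if b % 3)

print(equivalence_holds(100))
```

For instance a = 10, b = 1 (that is, d = 10) satisfies both conditions, so $(1 + 10\sqrt[3]{10} + \sqrt[3]{100})/3$ is an integer of $R(\sqrt[3]{10})$.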

In the course of the proof, it appeared that if $\omega = (x + y\alpha + z\beta)/3$ is an integer, and if one of x, y, z is divisible by 3, all of them are. In particular, if $x + y\alpha$ is an integer and x and y are rational, they are rational integers.

We now consider the units of L. If L is of the first kind, then

\[ \eta = x + y\alpha + z\beta \]

is a unit if and only if $N\eta = \eta\eta'\eta'' = \pm 1$; that is,

\[ x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = \pm 1. \tag{23} \]

If L is of the second kind, then

\[ \eta = u \cdot \frac{1 + a\alpha + b\beta}{3} + v\alpha + w\beta \]

is a unit if and only if

\[ x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = \pm 27, \tag{24} \]

where u = x, au + 3v = y, and bu + 3w = z. If $\eta$ is positive, the plus sign must be chosen in (23) and (24), since $\eta'$ and $\eta''$ are complex conjugates.

The field L has the property that each of its elements is either rational or of degree three. For if there were an element of degree two, L would be an extension of the field generated by that element, and so would be of even degree. It follows that $\pm 1$ are the only roots of unity in L. Since $\alpha$ has one real and two nonreal conjugates, we see by Theorem 2-45 that either L has only the units $\pm 1$, or else there is a fundamental unit $\xi$, which may be chosen between 0 and 1, such that every unit $\eta$ of L can be expressed in the form

\[ \eta = \pm\xi^n, \]

where n is a rational integer, positive, negative, or zero.

A positive unit of the form $\eta = x + y\alpha$ is always smaller than 1. For since $x^3 + dy^3 = 1$, we have

\[ \eta^{-1} = x^2 - xy\alpha + y^2\alpha^2 \ge 1 + \alpha + \alpha^2 > 3, \]

since xy is negative. Consequently, for such a unit we have

\[ \eta = \xi^n, \qquad n > 0. \]

The same remarks apply to a positive unit of the form $x + z\beta$.

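3-8 Two lemmas. For simplicity in notation, we define the binomial coefficient $\binom{m}{k}$ to be zero for k > m. Here and hereafter in this chapter, lower-case Latin letters stand for rational integers, unless otherwise specified.

Theorem 3-21. Let m be a positive integer. Then

\[ \binom{m}{0} + \binom{m}{3} + \binom{m}{6} + \cdots \not\equiv 0 \pmod{3}. \]

Proof: Put

\[ S_0 = \binom{m}{0} + \binom{m}{3} + \binom{m}{6} + \cdots, \]

\[ S_1 = \binom{m}{1} + \binom{m}{4} + \binom{m}{7} + \cdots, \]

\[ S_2 = \binom{m}{2} + \binom{m}{5} + \binom{m}{8} + \cdots. \]

Then

\[ S_0 + S_1 + S_2 = 2^m \equiv (-1)^m \pmod{3}, \]

and

\[ S_2 = \binom{m}{1}\frac{m - 1}{2} + \binom{m}{4}\frac{m - 4}{5} + \cdots \equiv -mS_1 + S_1 \pmod{3}, \]

\[ S_1 = \binom{m}{0}\frac{m}{1} + \binom{m}{3}\frac{m - 3}{4} + \cdots \equiv mS_0 \pmod{3}, \]

so that

\[ (1 + 2m - m^2)S_0 \equiv (-1)^m \pmod{3}. \]

Since the right side is not divisible by 3, neither is $S_0$.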
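The statement of Theorem 3-21, that the sum of the binomial coefficients $\binom{m}{0} + \binom{m}{3} + \binom{m}{6} + \cdots$ is never divisible by 3, is readily checked for small m. A minimal sketch follows (the function name `s0` is our own):

```python
from math import comb

def s0(m):
    """Sum of the binomial coefficients C(m, k) over k = 0, 3, 6, ..."""
    return sum(comb(m, k) for k in range(0, m + 1, 3))

# Residues modulo 3 for the first few values of m; none of them is 0.
print([s0(m) % 3 for m in range(1, 13)])
```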

Theorem 3-22. Suppose that x and y are integers such that (x, dy) = 1, and suppose that

\[ (x + y\sqrt[3]{d})^n = X + Y\sqrt[3]{d} + Z\sqrt[3]{d^2}, \]

where X, Y, and Z are rational and n > 1. Then $XYZ \ne 0$ except in the following cases:

\[ (\sqrt[3]{10} - 1)^5 = 99 - 45\sqrt[3]{10}, \]

\[ (\sqrt[3]{4} - 1)^4 = -15 + 12\sqrt[3]{2}. \]
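The two exceptional identities of Theorem 3-22 can be verified with exact integer arithmetic. The sketch below is only an illustration with hypothetical helper names: it represents $u_0 + u_1t + u_2t^2$, where $t^3 = d$, as a triple of integers and multiplies such triples directly.

```python
def mul_cubic(u, v, d):
    """Multiply u0 + u1*t + u2*t^2 by v0 + v1*t + v2*t^2, where t^3 = d."""
    u0, u1, u2 = u
    v0, v1, v2 = v
    return (u0 * v0 + d * (u1 * v2 + u2 * v1),
            u0 * v1 + u1 * v0 + d * u2 * v2,
            u0 * v2 + u1 * v1 + u2 * v0)

def power_cubic(u, n, d):
    """n-th power of u by repeated multiplication."""
    result = (1, 0, 0)
    for _ in range(n):
        result = mul_cubic(result, u, d)
    return result

print(power_cubic((-1, 1, 0), 5, 10))  # coefficients X, Y, Z for d = 10, n = 5
print(power_cubic((-1, 1, 0), 4, 4))   # coefficients X, Y, Z for d = 4, n = 4
```

In the second case the coefficient Y vanishes and Z = 6; since $\sqrt[3]{16} = 2\sqrt[3]{2}$, the term $6\sqrt[3]{16}$ is the same number as $12\sqrt[3]{2}$.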

Proof: Since (x, d) = 1, it is clear that $X \ne 0$. Suppose that Z = 0, so that

\[ \binom{n}{2}x^{n-2}y^2 + \binom{n}{5}x^{n-5}y^5d + \binom{n}{8}x^{n-8}y^8d^2 + \cdots = 0. \tag{25} \]

Dividing by $\binom{n}{2}y^2$, this becomes

\[ -x^{n-2} = \sum_{k \ge 1} \binom{n-2}{3k} \frac{2x^{n-3k-2}y^{3k}d^k}{(3k+1)(3k+2)}. \tag{26} \]

Let q be a prime divisor of y. Then since $q^{3k} \ge 2^{3k} > 3k + 2$ for $k \ge 1$, each term in the last sum is divisible by q, which is impossible since (x, y) = 1. Hence $y = \pm 1$.

When $n \equiv 0 \pmod{3}$, equation (26) can be written in the form

\[ -y^{n-3}d^{(n-3)/3} = \sum_{k \ge 1} \binom{n-1}{3k} \frac{x^{3k}y^{n-3k-3}d^{(n-3)/3-k}}{3k+1}; \]

when $n \equiv 1 \pmod{3}$,

\[ -y^{n-2}d^{(n-4)/3} = \sum_{k \ge 1} \binom{n-2}{3k} \frac{2x^{3k}y^{n-3k-2}d^{(n-4)/3-k}}{(3k+1)(3k+2)}; \]

and when $n \equiv 2 \pmod{3}$,

\[ -y^{n-2}d^{(n-2)/3} = \sum_{k \ge 1} \binom{n}{3k} x^{3k}y^{n-3k-2}d^{(n-2)/3-k}. \]

The same argument now shows that $x = \pm 1$, and since it is clear from (25) that xy < 0, we have x = -y.

Now let q be a prime divisor of d, and suppose that $q^a \,\|\, d$ (that is, $q^a \mid d$ but $q^{a+1} \nmid d$). If $q^a > 5$, then $q^{ak} > 5^k \ge 3k + 2$ for $k \ge 1$, so that each term in the sum in (26) is divisible by q, which is impossible since (x, d) = 1. If q = 3, then since $3 \nmid (3k+1)(3k+2)$ we reach the same contradiction. Hence $q^a = 2$ or 5, and d = 2, 5, or 10. The information obtained so far shows that

\[ 1 - \binom{n-2}{3}\frac{2d}{4 \cdot 5} + \binom{n-2}{6}\frac{2d^2}{7 \cdot 8} - \cdots = 0. \tag{27} \]

If d = 10, this becomes

\[ \frac{(n-5)(n^2 - 4n + 6)}{6} = \sum_{k \ge 2} (-1)^k \binom{n-2}{3k} \frac{2 \cdot 10^k}{(3k+1)(3k+2)}. \]

This equation is true for n = 5, and leads to the first of the exceptions mentioned in the theorem. For other values of n, we may divide through by (n - 5)/6 and obtain

\[ n^2 - 4n + 6 = -\sum_{k \ge 1} (-1)^k \binom{n-6}{3k-1} \frac{(n-2)(n-3)(n-4) \cdot 12 \cdot 10^{k+1}}{3k(3k+1)(3k+2)(3k+3)(3k+4)(3k+5)}. \]

The highest power of 5 which divides the denominator of a term in the sum is clearly at most 5(3k + 5), and since $5^{k+1} > 5(3k + 5)$ for $k \ge 2$, we have

\[ n^2 - 4n + 6 = (n-2)^2 + 2 \equiv \binom{n-6}{2} \frac{(n-2)(n-3)(n-4) \cdot 12 \cdot 10^2}{3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdot 8} \equiv 0 \pmod{5}, \]

which is false since -2 is a quadratic nonresidue of 5.

When d = 2 or 5, equation (27) leads to the congruence

\[ \binom{n-2}{0} + \binom{n-2}{3} + \binom{n-2}{6} + \cdots \equiv 0 \pmod{3}, \]

which is false by Theorem 3-21.

There remains only the possibility that Y = 0. The proof that this happens only in the case of the second exception mentioned in the theorem is completely similar to what has just been done for the case Z = 0, and we leave the details to the reader. (The only variation lies in the fact that d may now have the sole prime divisor 2, so that d = 2 or 4.)


3-9 The Delaunay-Nagell theorem. As we shall see in the next chapter, there is a general theorem which implies that the equation

\[ ax^3 + by^3 = c \tag{28} \]

has only finitely many solutions in integers x, y if a, b, and c are nonzero integers. In certain special cases, however, it is possible to make more precise statements about the number and nature of possible solutions. We shall concern ourselves here with the equation

\[ x^3 + dy^3 = 1, \tag{29} \]

which was first considered in detail by B. Delaunay. His method was later refined by T. Nagell, who also applied it to (28) in the case that c = 1 or 3. Nagell’s result concerning (29) is as follows.

Theorem 3-23. Equation (29) has at most one solution in integers x, y different from zero. If $x_1$, $y_1$ is a solution, the number $x_1 + y_1\sqrt[3]{d}$ is either the fundamental unit of $L = R(\sqrt[3]{d})$ or its square; the latter can happen for only finitely many values of d.
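The content of Theorem 3-23 can be illustrated by a direct search for solutions of $x^3 + dy^3 = 1$ with $y \ne 0$. The sketch below is an experiment only, under the assumption that a search over a bounded range of y is informative; the names `exact_cube_root` and `nontrivial_solutions` are hypothetical.

```python
def exact_cube_root(n):
    """Return the integer r with r^3 == n, or None if n is not a perfect cube."""
    sign = -1 if n < 0 else 1
    m = abs(n)
    r = round(m ** (1.0 / 3.0))
    for candidate in (r - 1, r, r + 1):  # guard against floating-point rounding
        if candidate >= 0 and candidate ** 3 == m:
            return sign * candidate
    return None

def nontrivial_solutions(d, y_max):
    """All (x, y) with x^3 + d*y^3 = 1, x != 0 and 0 < |y| <= y_max."""
    found = []
    for y in range(-y_max, y_max + 1):
        if y == 0:
            continue
        x = exact_cube_root(1 - d * y ** 3)
        if x is not None and x != 0:
            found.append((x, y))
    return found

for d in (2, 7, 19, 20):
    print(d, nontrivial_solutions(d, 200))
```

In every case tried the search returns at most one pair, in agreement with the theorem; the cases d = 19 and d = 20 are the two in which the solution comes from the square of a unit.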



If $d = \pm 1$, (29) has only trivial solutions. If d contains a cube larger than 1, it can be absorbed into the factor $y^3$. Hence we can assume that d is cube-free and larger than 1.

The idea of the proof is quite simple. If

\[ N(x_1 + y_1\sqrt[3]{d}) = x_1^3 + dy_1^3 = 1, \qquad y_1 \ne 0, \]

then $x_1 + y_1\sqrt[3]{d}$ is a positive unit of L, and as such is a positive power of the fundamental unit $\xi$ mentioned at the end of Section 3-7. It therefore suffices to show that no power of a positive unit smaller than 1, with exponent larger than 2, is of the special form $x + y\sqrt[3]{d}$, and to show that the square of a unit is of this form in only finitely many cases. We divide the proof into four parts, summarized in the next four theorems.

Theorem 3-24. The square of an irrational unit of L of the form

\[ \eta = x + y\alpha + z\beta, \qquad x, y, z \text{ in } Z, \]

is itself of the form $X + Y\alpha$ only if

\[ \eta = 1 + \sqrt[3]{20} - \sqrt[3]{50}. \]

The square of a unit of L of the form

\[ \eta = \tfrac{1}{3}(x + y\alpha + z\beta), \qquad 3 \nmid xyz, \]

(if such exists) is itself of the form $X + Y\alpha$ for only finitely many values of d.

Proof: Let $\eta = x + y\alpha + z\beta$ be a positive unit of L, so that, by (23),

\[ x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = 1 \tag{30} \]

and

\[ \eta^2 = (x^2 + 2abyz) + (2xy + az^2)\alpha + (2xz + by^2)\beta. \]

If the coefficient of $\beta$ in this last expression is 0, then

\[ z = -\frac{by^2}{2x}, \]

and substituting this into (30) we obtain

\[ x^3 + dy^3 - \frac{d^2y^6}{8x^3} + \frac{3dy^3}{2} = 1, \]

or

\[ d^2y^6 - 20x^3dy^3 - 8(x^6 - x^3) = 0, \]

whence

\[ dy^3 = 10x^3 \pm 2x\sqrt{27x^4 - 2x}. \tag{31} \]

Thus the number $27x^4 - 2x$ must be a square:

\[ (27x^3 - 2)x = w^2. \tag{32} \]

If x is even, then $(27x^3 - 2, x) = 2$, so that

\[ 27x^3 - 2 = \pm 2u^2, \qquad x = \pm 2v^2. \]

Since -1 is a quadratic nonresidue of 3, we must choose the lower sign, and eliminating x we obtain

\[ 108v^6 + 1 = u^2, \]

\[ (u - 1)(u + 1) = 108v^6. \]

Since (u - 1, u + 1) = 2, this implies that

\[ u \pm 1 = 54r^6, \qquad u \mp 1 = 2s^6, \]

whence

\[ 27r^6 - s^6 = (3r^2)^3 - (s^2)^3 = \pm 1. \]

From the truth of Fermat’s conjecture for n = 3, it follows that r = 0, which gives v = 0 and x = 0. But then also y = 0, by (31), which is impossible since $z\beta$ is not a unit.

If x is odd, (32) yields

\[ 27x^3 - 2 = \pm u^2, \qquad x = \pm v^2. \]

Here the upper sign must be chosen, and we have

\[ (3x)^3 = u^2 + 2, \]

which by Theorem 3-19 has the sole solution x = 1, $u = \pm 5$. By (31), $dy^3 = 10 \pm 10$, so that d = 20, y = 1. (If y = 0, then z = 0, and $\eta$ is rational.) The sole solution is therefore

\[ (1 + \sqrt[3]{20} - \sqrt[3]{50})^2 = -19 + 7\sqrt[3]{20}. \]


Now let $\eta$ be a positive unit of the form

\[ \eta = \tfrac{1}{3}(x + y\alpha + z\beta). \]

Then by (24),

\[ x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = 27, \tag{33} \]

and

\[ 9\eta^2 = (x^2 + 2abyz) + (2xy + az^2)\alpha + (2xz + by^2)\beta. \tag{34} \]

If $3 \mid x$, also $3 \mid y$ and $3 \mid z$, and we have already treated this case. Suppose that $3 \nmid x$. If the coefficient of $\beta$ in the expression for $\eta^2$ is 0, we again have

\[ z = -\frac{by^2}{2x}. \]

Substituting this into (33), it follows that

\[ dy^3 = 10x^3 \pm 6x\sqrt{3x^4 - 6x}, \tag{35} \]

so that

\[ 3x^4 - 6x = w^2. \tag{36} \]

If x is even, the fact that $3 \nmid x$ implies that

\[ x^3 - 2 = \pm 6u^2, \qquad x = \pm 2v^2, \]

whence

\[ \pm 4v^6 - 1 = \pm 3u^2. \]

Since $3 \nmid (4v^6 + 1)$, we must choose the upper sign; the last equation can then be written as

\[ (u + 1)^3 - (u - 1)^3 = (2v^2)^3, \]

so that |u| = 1. Hence x = 2, and by (35), $dy^3 = 80 \pm 72$. The lower sign yields d = 1 or 8, both of which are excluded. Hence d = 19, y = 2, and z = -1. The only solution in this case is

\[ \left(\frac{2 + 2\sqrt[3]{19} - \sqrt[3]{361}}{3}\right)^2 = -8 + 3\sqrt[3]{19}. \]

If x is odd, (36) implies that

\[ x^3 - 2 = \pm 3u^2, \qquad x = \pm v^2, \]

so that

\[ \pm v^6 - 2 = \pm 3u^2. \]

The lower sign must be chosen: $x = -v^2$ and

\[ 3u^2 - 2 = v^6. \tag{37} \]

But it is an immediate consequence of Theorem 4-17, to be proved in the next chapter, that (37) has only finitely many solutions, and the proof is complete.

We note for future use that if u, v satisfy (37), then v must be odd.
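The two exceptional units found in the proof of Theorem 3-24, namely $1 + \sqrt[3]{20} - \sqrt[3]{50}$ (d = 20, a = 5, b = 2) and $(2 + 2\sqrt[3]{19} - \sqrt[3]{361})/3$ (d = 19, a = 19, b = 1), can be checked against the norm form and the formula for the square used in that proof. The sketch below does so with integers only; the function names are our own.

```python
def norm_form(x, y, z, a, b):
    """The form x^3 + a b^2 y^3 + a^2 b z^3 - 3 a b x y z of (23) and (24)."""
    return x**3 + a * b * b * y**3 + a * a * b * z**3 - 3 * a * b * x * y * z

def square_coeffs(x, y, z, a, b):
    """Coefficients of 1, alpha, beta in (x + y*alpha + z*beta)^2."""
    return (x * x + 2 * a * b * y * z, 2 * x * y + a * z * z, 2 * x * z + b * y * y)

# d = 20: the unit 1 + alpha - beta has norm form 1 and square -19 + 7*alpha.
print(norm_form(1, 1, -1, 5, 2), square_coeffs(1, 1, -1, 5, 2))

# d = 19: the numerator 2 + 2*alpha - beta has norm form 27, and nine times
# the square of the unit is -72 + 27*alpha, so the square is -8 + 3*alpha.
print(norm_form(2, 2, -1, 19, 1), square_coeffs(2, 2, -1, 19, 1))
```

Both squares give solutions of $x^3 + dy^3 = 1$: $(-19)^3 + 20 \cdot 7^3 = 1$ and $(-8)^3 + 19 \cdot 3^3 = 1$.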

Theorem 3-25. The fourth power of a positive irrational unit of L is never of the form $X + Y\alpha$.

Proof: Let $\varepsilon$ be such a unit,

\[ \varepsilon = \tfrac{1}{3}(x_1 + y_1\alpha + z_1\beta), \]

and suppose that

\[ \varepsilon^4 = X + Y\alpha. \]

Then since the coefficient of $\beta$ in $\varepsilon^4$ is 0, we have

\[ 4x_1^3z_1 + 6bx_1^2y_1^2 + 4ab^2y_1^3z_1 + 12abx_1y_1z_1^2 + a^2bz_1^4 = 0. \tag{38} \]

If we put

\[ \eta = \varepsilon^2 = \tfrac{1}{3}(x + y\alpha + z\beta), \]

then

\[ x = \tfrac{1}{3}(x_1^2 + 2aby_1z_1), \]

\[ y = \tfrac{1}{3}(2x_1y_1 + az_1^2), \]

\[ z = \tfrac{1}{3}(2x_1z_1 + by_1^2). \]

Since $\eta^2 = X + Y\alpha$, we can apply Theorem 3-24. The cases

\[ d = 20, \qquad x = y = -z = 3, \]

\[ d = 19, \qquad x = y = 2, \quad z = -1 \]

are impossible, since in the first the above equation for z becomes $-9 = 2x_1z_1 + 2y_1^2$, while in the second the system is easily seen to be inconsistent for all choices of signs of $x_1$, $y_1$, $z_1$. Hence it must be that $x = -v^2$, where v is odd, so that

\[ 3v^2 + x_1^2 = -2aby_1z_1. \]

Since v is odd, so is $x_1$, so that $3v^2 + x_1^2 \equiv 4 \pmod{8}$. Hence three of the numbers a, b, $y_1$, $z_1$ are odd, and the fourth is even. By (38), $a^2bz_1^4$ is even, so $y_1$ is odd. If either a or $z_1$ is even, (38) implies that $6bx_1^2y_1^2 \equiv 0 \pmod{4}$, which is false. If b is even, (38) implies that $a^2bz_1^4 \equiv 0 \pmod{4}$, which is false since b is square-free. The proof is complete.

Theorem 3-26. The cube of a positive irrational unit of L is never of the form $X + Y\alpha$.

Proof: If

\[ \eta = \tfrac{1}{3}(x + y\alpha + z\beta) \]

is a positive unit, the coefficient of $\beta$ in $\eta^3$ is

\[ \tfrac{1}{9}(bxy^2 + x^2z + abyz^2). \]

We see from the equation

\[ x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = 27 \tag{39} \]

that (x, b) = 1, and deduce from the equation

\[ bxy^2 + x^2z + abyz^2 = 0 \tag{40} \]

that $b \mid z$. From (39) again, $\delta = (x, y, z) = 1$ or 3. Since $\eta \ne \pm 1$, y and z are not both zero, and we can write

\[ x = \delta d_1d_2x_1, \qquad y = \delta d_2d_3y_1, \qquad z = \delta bd_1d_3z_1, \tag{41} \]

where $d_1$, $d_2$, $d_3$ are integers and $x_1 > 0$, $y_1 > 0$, $z_1 > 0$. The numbers $d_1x_1$, $d_2y_1$, $d_3z_1$ are relatively prime in pairs. Substituting the values from (41) into (40) and dividing by $\delta^3bd_1d_2d_3$, we obtain

\[ d_2^2d_3x_1y_1^2 + d_1^2d_2x_1^2z_1 + ab^2d_1d_3^2y_1z_1^2 = 0. \]

It follows from this that $d_1 \mid x_1$, $d_2 \mid y_1$, and $d_3 \mid z_1$. Putting

\[ x_1 = d_1x_2, \qquad y_1 = d_2y_2, \qquad z_1 = d_3z_2, \]

substituting, and dividing by $d_1d_2d_3$, we obtain

\[ d_2^3x_2y_2^2 + d_1^3x_2^2z_2 + ab^2d_3^3y_2z_2^2 = 0. \]

A consequence of this is that $x_2 \mid ab^2d_3^3y_2z_2^2$, which in turn implies that $x_2 = 1$. Similarly, $y_2 = z_2 = 1$, so that

\[ d_1^3 + d_2^3 + ab^2d_3^3 = 0 \tag{42} \]

and

\[ x = \delta d_1^2d_2, \qquad y = \delta d_2^2d_3, \qquad z = \delta bd_1d_3^2. \]

Substituting these values into (39), we obtain

\[ d_1^6d_2^3 + ab^2d_2^6d_3^3 + a^2b^4d_1^3d_3^6 - 3ab^2d_1^3d_2^3d_3^3 = \frac{27}{\delta^3}. \]

Eliminating $ab^2d_3^3$ between this equation and (42), we have

\[ d_1^9 + 6d_1^6d_2^3 + 3d_1^3d_2^6 - d_2^9 = \frac{27}{\delta^3}, \tag{43} \]

and putting $d_1^3 = u$, $d_2^3 = v$, $3/\delta = w$, this becomes

\[ u^3 + 6u^2v + 3uv^2 - v^3 = w^3. \tag{44} \]

But it is easily verified that

\[ (u^3 + 6u^2v + 3uv^2 - v^3)U^3 = V^3 + W^3, \]

where

\[ U = u^2 + uv + v^2, \qquad V = u^3 + 3u^2v - v^3, \qquad W = 3u^2v + 3uv^2. \]

Since neither U nor V is zero for relatively prime u and v, (44) can hold with $w \ne 0$ only if W = 0, that is, if u = -v. In this case w = v. Since $(d_1, d_2) = 1$, it follows that $d_1 = -1$, $d_2 = 1$, $\delta = 3$. This, however, leads to the values x = 3, y = -3, z = 0, for which the coefficient of $\beta$ in $\eta^3$ is not zero.
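The polynomial identity on which the proof of Theorem 3-26 turns can be checked mechanically. In the sketch below the explicit forms $U = u^2 + uv + v^2$, $V = u^3 + 3u^2v - v^3$, $W = 3u^2v + 3uv^2$ are our reading of the partly illegible printed formulas and should be regarded as an assumption; the code then confirms numerically that $(u^3 + 6u^2v + 3uv^2 - v^3)U^3 = V^3 + W^3$ for these forms.

```python
def identity_sides(u, v):
    """Both sides of (u^3 + 6u^2 v + 3u v^2 - v^3) U^3 = V^3 + W^3."""
    U = u * u + u * v + v * v             # assumed form of U
    V = u**3 + 3 * u * u * v - v**3       # assumed form of V
    W = 3 * u * u * v + 3 * u * v * v     # assumed form of W
    left = (u**3 + 6 * u * u * v + 3 * u * v * v - v**3) * U**3
    right = V**3 + W**3
    return left, right

def identity_holds(bound):
    """Check the identity for all integers u, v with |u|, |v| <= bound."""
    for u in range(-bound, bound + 1):
        for v in range(-bound, bound + 1):
            left, right = identity_sides(u, v)
            if left != right:
                return False
    return True

print(identity_holds(20))
```

Since both sides are homogeneous forms of degree 9 in u and v, agreement at this many points already forces the identity to hold in general.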


Theorem 3-27. If p > 3 is prime and

\[ \eta = \tfrac{1}{3}(x + y\alpha + z\beta) \]

is a positive unit smaller than 1, then $\eta^p$ is not of the form $X + Y\alpha$.

Proof: Suppose that z = 0. Then $3 \mid x$ and $3 \mid y$, and

\[ N\eta = \left(\frac{x}{3}\right)^3 + d\left(\frac{y}{3}\right)^3 = 1, \]

so that $\left(\frac{x}{3}, d\,\frac{y}{3}\right) = 1$; by Theorem 3-22, the coefficient of $\beta$ in $\eta^p$ is not zero. Thus $z \ne 0$. By the same reasoning (applied in the field $L' = R(\beta) = R(\alpha) = L$), y cannot be zero.

As we saw in the proof of Theorem 3-20, it follows from the representation

\[ \omega = x_1 + x_2\alpha + x_3\beta \]

of an arbitrary integer $\omega$ of L that

\[ \alpha(\omega + \rho\omega' + \rho^2\omega'') = 3abx_3. \]

Taking $\omega = \eta^p$, we see that if the coefficient of $\beta$ in $\eta^p$ is zero, it must be that

\[ \left(\frac{x + y\alpha + z\beta}{3}\right)^p + \rho\left(\frac{x + \rho y\alpha + \rho^2 z\beta}{3}\right)^p + \rho^2\left(\frac{x + \rho^2 y\alpha + \rho z\beta}{3}\right)^p = 0. \tag{45} \]

Suppose first that $p \equiv 1 \pmod{3}$. Then since $\rho^p = \rho$, (45) can be written in the form

\[ \left(\frac{x\rho + y\rho^2\alpha + z\beta}{3}\right)^p + \left(\frac{x\rho^2 + y\rho\alpha + z\beta}{3}\right)^p = -\left(\frac{x + y\alpha + z\beta}{3}\right)^p. \]

Since p is odd, the left side is divisible by

\[ \frac{x\rho + y\rho^2\alpha + z\beta}{3} + \frac{x\rho^2 + y\rho\alpha + z\beta}{3} = \frac{-x - y\alpha + 2z\beta}{3}; \]

this number is an integer, and since it divides $\eta^p$ it is a unit. Consequently,

\[ -x^3 - ab^2y^3 + 8a^2bz^3 - 6abxyz = \pm 27. \]

Since $\eta$ is a positive unit, also

\[ x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = 27, \]

and by addition,

\[ 9a^2bz^3 - 9abxyz = 0 \text{ or } 54. \]

In the first case $az^2 - xy$ must be zero. But this number is the coefficient of $\alpha$ in $3/\eta$, and as we saw at the end of Section 3-7, it is not zero, since $1/\eta > 1$.

In the second case we have

\[ abz(az^2 - xy) = 6. \]

But then x, y, z are not all divisible by 3, so that L is of the second kind. This is impossible, since if $ab \mid 6$ then $a^2 - b^2 \not\equiv 0 \pmod{9}$.

The case in which $p \equiv 2 \pmod{3}$ proceeds similarly. Equation (45) can be written in the form

\[ \left(\frac{x\rho^2 + y\alpha + z\rho\beta}{3}\right)^p + \left(\frac{x\rho + y\alpha + z\rho^2\beta}{3}\right)^p = -\left(\frac{x + y\alpha + z\beta}{3}\right)^p, \]

from which it follows that the number

\[ \frac{x\rho^2 + y\alpha + z\rho\beta}{3} + \frac{x\rho + y\alpha + z\rho^2\beta}{3} = \frac{-x + 2y\alpha - z\beta}{3} \]

is a unit. As before,

\[ 9ab^2y^3 - 9abxyz = 0 \text{ or } 54. \]

Since $by^2 - xz$ is the coefficient of $\beta$ in $3/\eta$, it is not zero. But it is also impossible that $(by^2 - xz) \mid 6$ and $ab \mid 6$, since then L must be of both the first and second kinds. The proof is complete.




Theorems 3-25, 3-26, and 3-27 show that any nonzero solution of $x^3 + dy^3 = 1$ must correspond either to the fundamental unit of L, or to its square. Not both of these numbers can lead to solutions, by Theorem 3-22 with n = 2. This completes the proof of Theorem 3-23.

REFERENCES 

Section 3-5 

For a complete exposition of what is known concerning Fermat’s con¬ 
jecture, see H. S. Vandiver, “Fermat’s last theorem: the history and the 
nature of the results concerning it,” American Mathematical Monthly 63, 
555-578 (1946). The result of Lehmer, Lehmer, and Vandiver was an¬ 
nounced in Proceedings of the National Academy of Sciences 40, 25-33 
(1954). Landau gives a proof of Kummer’s lemma; see his Vorlesungen 
iiber Zahlentheorie, vol. 3, Leipzig: S. Hirzel Verlag, 1927. 

Section 3-6 

The equation $y^2 = x^3 + k$ was the subject of L. J. Mordell’s inaugural
address, A Chapter in the Theory of Numbers, New York: Cambridge
University Press, 1947. Also see Dickson’s History of the Theory of Numbers,
Washington: Carnegie Institution of Washington, 1919; reprinted,
Chelsea Publishing Company, New York, 1950; vol. 2, pp. 531-539.

Section 3-7 

Dedekind’s fundamental paper on pure cubic fields is in Journal für die
reine und angewandte Mathematik (Berlin) 121, 40-123 (1899).

Sections 3-8, 3-9 

We have followed the treatment by Nagell, Journal de Mathématiques
Pures et Appliquées (Paris) 4, 209-270 (1925). Delaunay (Comptes Rendus
Hebdomadaires des Séances de l’Académie des Sciences (Paris) 171, 336
(1920) and 172, 434 (1921)) announced that equation (28) has at most five
solutions in case c = 1. His work on (29) was announced in Comptes
Rendus 162, 150-151 (1916).



CHAPTER 4 


THE THUE-SIEGEL-ROTH THEOREM 

4-1 Introduction. It is shown in introductory texts in number
theory* that if $\alpha$ is a quadratic irrationality (that is, an algebraic
number of degree two), then there is a positive constant $c$ such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^2}$$

for every pair of rational integers $p$, $q$ with $q > 0$. The idea used
there suffices to prove the following generalization, which is due to
J. Liouville.

Theorem 4-1. If $\alpha$ is an algebraic number of degree $n \ge 2$, then
there exists a positive constant $c$ such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^n} \tag{1}$$

for every pair of rational integers $p$, $q$ with $q > 0$.
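
A small numerical illustration of Theorem 4-1 may help fix ideas. The sketch below takes $\alpha = \sqrt{2}$ (so $n = 2$, $f(x) = x^2 - 2$, $a_0 = 1$) and checks inequality (1) for all denominators up to a chosen limit. The constant used, $c = \min(\beta, 1/(a_0(3\beta)^{n-1}))$ with $\beta$ the largest absolute value of a conjugate, is an assumption read off from the two-case argument of the proof; the function names and the search limit are illustrative only.

```python
import math

def liouville_constant(a0, conjugate_abs, n):
    """Assumed constant min(beta, 1/(a0*(3*beta)**(n-1))), beta = max |conjugate|."""
    beta = max(conjugate_abs)
    return min(beta, 1.0 / (a0 * (3.0 * beta) ** (n - 1)))

def worst_scaled_error(alpha, n, q_max):
    """Smallest value of q**n * |alpha - p/q| for 1 <= q <= q_max.

    For each q only the nearest integer p to alpha*q is tried, since it
    minimizes |alpha - p/q| for that q.
    """
    worst = math.inf
    for q in range(1, q_max + 1):
        p = round(alpha * q)
        worst = min(worst, q ** n * abs(alpha - p / q))
    return worst

alpha = math.sqrt(2.0)                  # zero of x^2 - 2; conjugates are +-sqrt(2)
c = liouville_constant(1, [alpha, alpha], 2)
worst = worst_scaled_error(alpha, 2, 2000)
print("c =", c, " smallest q^2*|alpha - p/q| found =", worst)
```

Floating-point arithmetic is adequate here only because the denominators are small; for large $q$ exact integer arithmetic on $|2q^2 - p^2|$ would be the safer choice.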

Proof: Let $\alpha$ be a zero of the irreducible polynomial

$$f(x) = a_0x^n + \cdots + a_n, \qquad a_0 > 0,$$

with coefficients in Z, and let $\alpha_1 = \alpha, \alpha_2, \dots, \alpha_n$ be its conjugates,
so that

$$f(x) = a_0(x - \alpha)(x - \alpha_2) \cdots (x - \alpha_n).$$

Then the number

$$q^n f\!\left(\frac{p}{q}\right) = a_0p^n + a_1p^{n-1}q + \cdots + a_nq^n$$

is a rational integer different from zero, and it therefore has absolute

* See for example, Volume I, Section 8-4. In Section 8-5 Hurwitz’ 
theorem is stated and proved, and in Chapter 9 the problem of approxi¬ 
mating real numbers by rationals is considered; all this material is assumed 
in the present section. 




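value at least 1. Hence

$$\left|\alpha - \frac{p}{q}\right| = \frac{|f(p/q)|}{a_0 \prod_{k=2}^{n} |\alpha_k - p/q|} \ge \frac{1}{a_0 q^n \prod_{k=2}^{n} |\alpha_k - p/q|}.$$

Put

$$\beta = \max(|\alpha_1|, \dots, |\alpha_n|).$$

We consider two cases, according as $|p/q|$ is greater than $2\beta$ or not.
In the first case we have the trivial lower bound

$$\left|\alpha - \frac{p}{q}\right| > \beta.$$

In the second case the inequality $|\alpha_k - p/q| < 3\beta$ holds for
$k = 2, \dots, n$, and, by the inequality of the preceding paragraph,

$$\left|\alpha - \frac{p}{q}\right| > \frac{1}{a_0 q^n (3\beta)^{n-1}}.$$

Thus, the theorem holds with

$$c = \min\left(\beta, \frac{1}{a_0(3\beta)^{n-1}}\right).$$
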
Liouville used this theorem to show the existence of nonalgebraic
numbers; this will be discussed in detail in the next chapter. At the
moment, let us consider a hypothetical improvement of Theorem
4-1, in which the inequality (1) is replaced by

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^{\nu}}, \tag{2}$$

where $\nu$ is any number smaller than $n$. A. Thue noticed that if
such a theorem could be proved, it would have the important consequence
that the Diophantine equation

$$q^n f\!\left(\frac{p}{q}\right) = a_0p^n + a_1p^{n-1}q + \cdots + a_nq^n = A \tag{3}$$

can have only finitely many solutions for any fixed rational integer $A$
different from zero, if $f(x)$ has distinct zeros. To see this, let the zeros
of $f(x)$ again be $\alpha_1, \dots, \alpha_n$, and put

$$\gamma = \min_{i \ne j} (|\alpha_i - \alpha_j|).$$





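Suppose that (3) has infinitely many solutions $p$, $q$. Then there
must be at least one $\alpha_i$, which by suitable naming we can take to be
$\alpha$, which is a limit point of the numbers $p/q$, since otherwise the
quantity

$$\left|q^n f\!\left(\frac{p}{q}\right)\right| = a_0 q^n \prod_{k=1}^{n} \left|\frac{p}{q} - \alpha_k\right|$$

is certainly not bounded as $q$ increases indefinitely. There
must therefore be infinitely many solutions of (3) for which
$|\alpha - p/q| < \gamma/2$. But for all such solutions,

$$\left|\alpha - \frac{p}{q}\right| = \frac{|A|}{a_0 q^n \prod_{k=2}^{n} \left|\dfrac{p}{q} - \alpha_k\right|} < \frac{|A|}{a_0 (\gamma/2)^{n-1}} \cdot \frac{1}{q^n},$$

and this is at variance with (2) if $\nu$ is a constant smaller than $n$ and
$q$ is sufficiently large.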
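
To make equation (3) concrete, here is a brute-force sketch that lists the solutions of a binary form equation inside a box $|p|, |q| \le$ `bound`. The example form $p^3 - 2q^3 = 1$, the function name, and the bound are illustrative assumptions; a finite search of this kind cannot by itself prove finiteness, it only shows what the solution set looks like in a small range.

```python
def thue_solutions(coeffs, A, bound):
    """Integer pairs (p, q), |p|, |q| <= bound, with
    a_0 p^n + a_1 p^(n-1) q + ... + a_n q^n = A, where coeffs = [a_0, ..., a_n]."""
    n = len(coeffs) - 1
    solutions = []
    for p in range(-bound, bound + 1):
        for q in range(-bound, bound + 1):
            value = sum(a * p ** (n - i) * q ** i for i, a in enumerate(coeffs))
            if value == A:
                solutions.append((p, q))
    return solutions

# The form p^3 - 2 q^3 corresponds to f(x) = x^3 - 2, which has distinct zeros.
print(thue_solutions([1, 0, 0, -2], 1, 50))
```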

Thue showed that (2) holds with

$$\nu > \frac{n}{2} + 1.$$

Later C. L. Siegel improved Thue's result, showing that (2) holds
with

$$\nu > \min_{s = 1, \dots, n-1} \left(\frac{n}{s+1} + s\right),$$

and in particular with $\nu = 2\sqrt{n}$. In 1947 F. J. Dyson made the
further improvement $\nu > \sqrt{2n}$, and finally in 1955 K. F. Roth
proved that (2) holds with $\nu = 2 + \epsilon$, for each $\epsilon > 0$, for all but a
finite number of fractions $p/q$. This is the best theorem possible if
$\nu$ is to be independent of $q$, since Hurwitz’ theorem shows that the
corresponding statement is false for every irrational algebraic number,
for $\nu = 2$ and suitable $c$. Roth’s work is similar in some respects to a
simplification of Dyson’s proof, published by T. Schneider in 1948.
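
The successive exponents can be compared numerically. The sketch below tabulates, for a few degrees $n$, the thresholds $n/2 + 1$ (Thue), $\min_s (n/(s+1) + s)$ (Siegel), $\sqrt{2n}$ (Dyson), and $2$ (Roth); the function names and the sample degrees are chosen here for illustration.

```python
import math

def thue(n):
    return n / 2 + 1

def siegel(n):
    # minimum over s = 1, ..., n-1; requires n >= 2
    return min(n / (s + 1) + s for s in range(1, n))

def dyson(n):
    return math.sqrt(2 * n)

ROTH = 2.0

print("n  Thue  Siegel  Dyson  Roth")
for n in (3, 5, 10, 20, 50):
    print(n, thue(n), round(siegel(n), 3), round(dyson(n), 3), ROTH)
```

The gap between the Liouville exponent $n$ and these thresholds widens with $n$, which is why the improvements matter most for numbers of high degree.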

In addition to the problem of sharpening Theorem 4-1 by decreasing
the exponent of $q$, we may also consider the question of extending
the methods so as to analyze the approximability of an algebraic
number by other algebraic numbers. This is not mere generalization
for its own sake: as we saw in the preceding chapter, it is natural to



consider the solvability of in a larger set of integers 

than Z, and the same is true of many other Diophantine equations.
But if the variables in an equation range over the integers of an
algebraic number field, then to the extent that approximation
theorems are useful at all they must be formulated in terms of algebraic
rather than rational numbers.

While Siegel gave many algebraic variants of his basic result, Roth 
presented a detailed proof only in the rational case. In this chapter 
we give a complete proof of a useful algebraic version of Roth’s 
theorem. Unfortunately, the proof is complicated; the student 
might profit by first examining Schneider’s work mentioned above. 

We shall proceed as follows. In the next three sections we shall 
make some definitions, and obtain some preliminary results, which 
are needed for the proof of the main theorem: in Section 4-2 some 
properties of polynomials will be treated, in Section 4-3 the concept 
of the generalized Wronskian will be introduced, and in Section 4-4 
the index of a polynomial will be defined and discussed. Then we 
shall proceed to prove, in Sections 4-5 and 4-6, several lemmas on 
which the proof of the main theorem depends, and finally, in Section 
4-7, we shall state and prove the Thue-Siegel-Roth theorem itself. 
In the remainder of the chapter, some applications of the theorem 
will be taken up. 

4-2 Polynomials. If $P(z)$ is a polynomial with arbitrary complex
coefficients, we denote by $\|P\|$ the maximum of the absolute values of
its coefficients. If $\alpha$ is an algebraic number and $P(z) = 0$ is its defining
equation, so that $P$ is irreducible and has relatively prime
coefficients in Z, we define the height $H(\alpha)$ to be $\|P\|$. Finally, if $P$ has
algebraic coefficients, we designate by $\overline{|P|}$ the maximum of the
absolute values of their conjugates. Clearly $\|P\| = \overline{|P|}$ if $P$ has
coefficients in Z, and for a nonzero constant polynomial $P(z) = \alpha$ the
new definition of $\overline{|\alpha|}$ agrees with the old one.

Except when a polynomial is written as a determinant, it will be 
supposed that no two terms have the same exponents on the variable, 
or sets of exponents on the variables. 

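Theorem 4-2. Let $\zeta, \lambda_1, \dots, \lambda_h$ be complex numbers, and put

$$L(z) = \zeta \prod_{k=1}^{h} (z - \lambda_k).$$

Then

$$|\zeta| \prod_{k=1}^{h} (1 + |\lambda_k|) \le 6^h \|L\|.$$

Proof: There is no loss in generality in supposing that $\zeta = 1$, since
a change in $\zeta$ affects in the same way the two sides of the inequality to
be proved. Let $\lambda_1, \dots, \lambda_t$ be those of the $\lambda$'s such that $|\lambda_k| < 2$. If
$f(z) = \prod_{k=1}^{t} (z - \lambda_k)$, then there is a complex number $z_0$ with $|z_0| = 1$
for which $|f(z_0)| \ge 1$. To see this, let $\epsilon$ be a primitive $(t + 1)$th root of unity,
and suppose that

$$f(z) = \sum_{r=0}^{t} a_r z^r, \qquad a_t = 1.$$

Then

$$\sum_{\nu=0}^{t} \epsilon^{\nu} f(\epsilon^{\nu}) = \sum_{\nu=0}^{t} \epsilon^{\nu} \sum_{r=0}^{t} a_r \epsilon^{r\nu} = \sum_{r=0}^{t} a_r \sum_{\nu=0}^{t} \epsilon^{(r+1)\nu}.$$

But

$$\sum_{\nu=0}^{t} \epsilon^{(r+1)\nu} = \begin{cases} 0 & \text{if } (t+1) \nmid (r+1), \\ t + 1 & \text{if } (t+1) \mid (r+1), \end{cases} \tag{4}$$

and since $r \le t$, $(t + 1) \mid (r + 1)$ if and only if $r = t$. Hence

$$\sum_{\nu=0}^{t} \epsilon^{\nu} f(\epsilon^{\nu}) = (t + 1)a_t = t + 1,$$

so that one of the $t + 1$ numbers $|f(\epsilon^{\nu})|$ is at least 1. Thus

$$\prod_{k=1}^{t} (1 + |\lambda_k|) \le (1 + 2)^t = 3^t \le 3^t \left|\prod_{k=1}^{t} (z_0 - \lambda_k)\right|. \tag{5}$$

If $t < h$, then for $k = t + 1, \dots, h$ we have $|\lambda_k| \ge 2$ and

$$\frac{1 + |\lambda_k|}{|z_0 - \lambda_k|} \le \frac{1 + |\lambda_k|}{|\lambda_k| - |z_0|} = \frac{|\lambda_k| + 1}{|\lambda_k| - 1} = 1 + \frac{2}{|\lambda_k| - 1} \le 1 + \frac{2}{2 - 1} = 3,$$

so that

$$\prod_{k=t+1}^{h} (1 + |\lambda_k|) \le 3^{h-t} \left|\prod_{k=t+1}^{h} (z_0 - \lambda_k)\right|.$$

Combining this with (5), we have

$$\prod_{k=1}^{h} (1 + |\lambda_k|) \le 3^h \left|\prod_{k=1}^{h} (z_0 - \lambda_k)\right| \le 3^h \|L\| (|z_0|^h + \cdots + 1) = 3^h (h + 1) \|L\| \le 6^h \|L\|.$$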
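
The inequality of Theorem 4-2 is easy to test on examples. The following sketch builds $L(z) = \zeta \prod (z - \lambda_k)$ from its zeros and returns both sides of the inequality; the helper names are hypothetical, and the sample zeros are chosen only for illustration.

```python
from math import prod

def poly_from_roots(zeta, roots):
    """Coefficients (highest degree first) of zeta * prod_k (z - lambda_k)."""
    coeffs = [complex(zeta)]
    for lam in roots:
        shifted = coeffs + [0j]                      # coefficients of z * (current polynomial)
        scaled = [0j] + [-lam * c for c in coeffs]   # coefficients of -lam * (current polynomial)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

def max_norm(coeffs):
    """||L||: the largest absolute value of a coefficient."""
    return max(abs(c) for c in coeffs)

def both_sides(zeta, roots):
    """Return (|zeta| * prod(1 + |lambda_k|), 6^h * ||L||)."""
    lhs = abs(zeta) * prod(1 + abs(lam) for lam in roots)
    rhs = 6 ** len(roots) * max_norm(poly_from_roots(zeta, roots))
    return lhs, rhs

# (z - 1)(z - 2)(z - 3) = z^3 - 6z^2 + 11z - 6
print(both_sides(1, [1, 2, 3]))
```

In examples of this kind the factor $6^h$ is far from sharp; the theorem trades precision for a bound that is uniform in the location of the zeros.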





Theorem 4-3. Suppose that $f(z)$ and $g(z)$ are polynomials with
complex coefficients, of degrees $n$ and $m$ respectively. Suppose further
that the coefficient of $z^m$ in $g(z)$ has absolute value at least 1. Then

$$\|f\| \le 6^{n+m} \|fg\|.$$

Proof: Let

$$f(z) = a_0(z - \lambda_1) \cdots (z - \lambda_n),$$

$$g(z) = b_0(z - \lambda_{n+1}) \cdots (z - \lambda_{n+m}).$$

Then

$$\|f\| \le \left\| a_0 \prod_{k=1}^{n} (z + |\lambda_k|) \right\| \le |a_0b_0| \prod_{k=1}^{n} (1 + |\lambda_k|) \le |a_0b_0| \prod_{k=1}^{n+m} (1 + |\lambda_k|),$$

and the desired result follows from Theorem 4-2.


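Theorem 4-4. If $f(z)$ is an arbitrary polynomial of degree $n$,
with real coefficients, then

$$\|f\|^m \le (mn + 1)\|f^m\|. \tag{6}$$

Proof: Let $f(z) = a_0 + a_1z + \cdots + a_nz^n$, and let $\|f\| = a$. The
theorem is certainly true if either $|a_0| = a$ or $|a_n| = a$, since the first
and last coefficients in $f^m(z)$ are the $m$th powers of $a_0$ and $a_n$, respectively,
so that in this case $\|f^m\| \ge \|f\|^m$. If we put

$$f^*(z) = z^n f(1/z),$$

then clearly

$$\|f^*\| = \|f\| \quad \text{and} \quad \|(f^*)^m\| = \|f^m\|,$$

so that we can suppose, with no loss in generality, that the numerically
largest of all the coefficients in $f(z)$ is $a_t$, where $n/2 \le t < n$.

Put

$$g(z, \theta) = f(z) - ae^{i\theta}z^n,$$

and let $\alpha = \alpha(\theta)$ be the numerically largest of the zeros of $g(z, \theta)$ for
each $\theta$. The inequality (6) holds if, for some $\theta$, $|\alpha(\theta)| \ge 1$. For

$$|f^m(\alpha)| = |ae^{i\theta}\alpha^n|^m = a^m|\alpha|^{mn},$$

while for $|\alpha| \ge 1$,

$$|f^m(\alpha)| \le \|f^m\|(1 + |\alpha| + \cdots + |\alpha|^{mn}) \le (mn + 1)\|f^m\| \, |\alpha|^{mn},$$

so that

$$a^m = \|f\|^m \le (mn + 1)\|f^m\|.$$

We know that $|a_n| < a$. Hence if $f(1) \ge a$, then

$$g(1, 0) \ge a - a = 0, \qquad g(\infty, 0) = -\infty,$$

and $1 \le |\alpha(0)| < \infty$. Similarly, if $f(1) \le -a$, then

$$g(1, \pi) \le -a + a = 0, \qquad g(\infty, \pi) = \infty,$$

and $1 \le |\alpha(\pi)| < \infty$. This proves the theorem unless $|f(1)| < a$,
which we henceforth assume.

Now put $z = e^{i\varphi}$, so that

$$g(e^{i\varphi}, \theta) = f(e^{i\varphi}) - ae^{i(\theta + n\varphi)}.$$

If we find a $\varphi_0$ such that $|f(e^{i\varphi_0})| = a$, then $\theta_0$ can be determined so
that $g(e^{i\varphi_0}, \theta_0) = 0$; this gives $|\alpha(\theta_0)| \ge 1$ and proves the theorem.
Since $|f(e^{i\varphi})|$ is a continuous function of $\varphi$, and since $|f(1)| < a$, it
suffices to prove the existence of a $\varphi_0$ such that $|f(e^{i\varphi_0})| \ge a$.

Let $\epsilon$ be a primitive $(t + 1)$th root of unity, where $|a_t| = a$ and
$n/2 \le t < n$. Then

$$\sum_{\nu=0}^{t} \epsilon^{\nu} f(\epsilon^{\nu}) = \sum_{\nu=0}^{t} \epsilon^{\nu} \sum_{k=0}^{n} a_k \epsilon^{k\nu} = \sum_{k=0}^{n} a_k \sum_{\nu=0}^{t} \epsilon^{(k+1)\nu}.$$

Since $k \le n$ and $t \ge n/2$, we have that $(t + 1) \mid (k + 1)$ if and only if
$k = t$. Hence, by (4),

$$\sum_{\nu=0}^{t} \epsilon^{\nu} f(\epsilon^{\nu}) = a_t(t + 1),$$

so that for some $\nu$,

$$|\epsilon^{\nu} f(\epsilon^{\nu})| = |f(\epsilon^{\nu})| \ge |a_t| = a.$$

The proof is complete.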
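
Inequality (6) can be checked directly for polynomials with integer coefficients. The sketch below multiplies out $f^m$ and compares $\|f\|^m$ with $(mn + 1)\|f^m\|$; coefficient lists are stored lowest degree first, and the function names are illustrative.

```python
def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def poly_pow(f, m):
    """f raised to the m-th power, m a non-negative integer."""
    out = [1]
    for _ in range(m):
        out = poly_mul(out, f)
    return out

def max_norm(f):
    return max(abs(c) for c in f)

def satisfies_inequality_6(f, m):
    """True when ||f||^m <= (m*n + 1) * ||f^m||, with n = deg f."""
    n = len(f) - 1   # assumes the last listed coefficient is nonzero
    return max_norm(f) ** m <= (m * n + 1) * max_norm(poly_pow(f, m))

f = [1, -2, 1]                       # 1 - 2z + z^2 = (1 - z)^2
print(poly_pow(f, 2), satisfies_inequality_6(f, 2))
```

The example $(1 - z)^2$ is one where the largest coefficient is an interior one, which is the case the proof has to work hardest on.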

Theorem 4-5. If $f_1(z), \dots, f_k(z)$ are polynomials with algebraic
coefficients, then

$$\overline{\left|\prod_{\nu=1}^{k} f_\nu\right|} \le \prod_{\nu=1}^{k} (1 + \deg f_\nu) \prod_{\nu=1}^{k} \overline{|f_\nu|}.$$

Proof: There is no loss in generality in supposing that

$$\deg f_1 \ge \deg f_2 \ge \cdots \ge \deg f_k.$$

The product $f_1f_2$ is a polynomial each of whose coefficients is a sum of
products of a coefficient of $f_1$ and a coefficient of $f_2$, the number of
summands being at most $1 + \deg f_2$. Hence

$$\overline{|f_1f_2|} \le (1 + \deg f_2)\,\overline{|f_1|}\;\overline{|f_2|}.$$

Similarly,

$$\overline{|f_1f_2f_3|} \le (1 + \deg f_3)\,\overline{|f_1f_2|}\;\overline{|f_3|} \le (1 + \deg f_2)(1 + \deg f_3)\,\overline{|f_1|}\;\overline{|f_2|}\;\overline{|f_3|},$$

and so on.

Theorem 4-6. Let $p$ and $r$ be positive integers, with $1 \le r < p$.
Suppose that $F(z_1, \dots, z_p)$, $G(z_1, \dots, z_r)$, and $H(z_{r+1}, \dots, z_p)$ are
polynomials with coefficients in an algebraic number field K, those of
$F$ being integers, and suppose that

$$F(z_1, \dots, z_p) = G(z_1, \dots, z_r)H(z_{r+1}, \dots, z_p).$$

Then if $\gamma$ is any coefficient in $F$, there is a factorization $\gamma = \alpha\beta$ in K
such that the coefficients in $\alpha H$ and $\beta G$ are integers in K.

Proof: Let the coefficients in $G$ be $\alpha_1, \dots, \alpha_s$, and those in $H$ be
$\beta_1, \dots, \beta_t$, in some order. Then, since the variables in $G$ and $H$ are
disjoint, the coefficients in $F$ are simply the products $\alpha_i\beta_j$. Since the
coefficients in $F$ are integers, all the products $\alpha_1\beta_1, \dots, \alpha_1\beta_t$ are
integers, as are all the products $\beta_1\alpha_1, \dots, \beta_1\alpha_s$. But these two sets of
numbers are just the coefficients in $\alpha_1H$ and $\beta_1G$.

4-3 Generalized Wronskians. Polynomials $f_0(z_1, \dots, z_p), \dots,$
$f_{l-1}(z_1, \dots, z_p)$ with coefficients in an algebraic number field K are
said to be linearly dependent if some linear combination of them, with
constant coefficients in K which are not all zero, vanishes identically,
and are otherwise said to be independent. In the case of a single
independent variable, it is well known that the question of independence
of a set of functions can sometimes be settled by reference to
their Wronskian. For our purposes it is convenient to define this as
the determinant

$$W(z) = \det\left(\frac{1}{\mu!}\left(\frac{d}{dz}\right)^{\mu} f_\nu(z)\right), \qquad \mu, \nu = 0, 1, \dots, l - 1,$$

which differs from the usual definition only in the presence of the
nonzero constant factor

$$\frac{1}{0!\,1! \cdots (l - 1)!}.$$

The exact relation of the behavior of the Wronskian to independence,
as applied to polynomials, is indicated in the first part of the next
theorem.

For functions of several variables, the situation is not quite so
simple, since there are then several partial derivatives to consider.
We proceed as follows. Let $\Delta_0, \Delta_1, \dots, \Delta_\mu, \dots, \Delta_{l-1}$ be differential
operators of the form

$$\frac{1}{j_1! \cdots j_p!}\left(\frac{\partial}{\partial z_1}\right)^{j_1} \cdots \left(\frac{\partial}{\partial z_p}\right)^{j_p},$$

such that the order $j_1 + \cdots + j_p$ of $\Delta_\mu$ does not exceed $\mu$, for
$0 \le \mu \le l - 1$. Then the function

$$G(z_1, \dots, z_p) = \begin{vmatrix} \Delta_0f_0 & \Delta_0f_1 & \cdots & \Delta_0f_{l-1} \\ \Delta_1f_0 & \Delta_1f_1 & \cdots & \Delta_1f_{l-1} \\ \vdots & \vdots & & \vdots \\ \Delta_{l-1}f_0 & \Delta_{l-1}f_1 & \cdots & \Delta_{l-1}f_{l-1} \end{vmatrix}$$

is called a generalized Wronskian of $f_0, \dots, f_{l-1}$. Except in the trivial
case $p = l = 1$, there are several $\Delta_\mu$'s for each $\mu$, and hence more
than one generalized Wronskian. In the case of functions of one
variable, the ordinary Wronskian is that generalized Wronskian for
which the order of $\Delta_\mu$ is exactly $\mu$, for $0 \le \mu \le l - 1$.

Theorem 4-7. (a) If $f_0, \dots, f_{l-1}$ are $l$ polynomials over K in the
single variable $z$, whose Wronskian $W(z)$ vanishes identically, then
they are dependent over K.

(b) If $f_0, \dots, f_{l-1}$ are $l$ polynomials over K in the variables
$z_1, \dots, z_p$, for which every generalized Wronskian $G_i(z_1, \dots, z_p)$
vanishes identically, then they are dependent over K.
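
The normalized Wronskian of part (a) can be computed exactly for small examples. The sketch below is a minimal implementation under stated assumptions: polynomials are coefficient lists with rational entries, lowest degree first, and the determinant is expanded over permutations, which is practical only for a handful of polynomials. It illustrates the easy direction (a dependent family has vanishing Wronskian) and shows a nonvanishing Wronskian for an independent family; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

def p_add(f, g):
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def p_mul(f, g):
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def normalized_derivative(f, mu):
    """(1/mu!) (d/dz)^mu f: the coefficient of z^k becomes C(k, mu) at z^(k - mu)."""
    return [Fraction(comb(k, mu)) * c for k, c in enumerate(f)][mu:] or [Fraction(0)]

def sign(perm):
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def wronskian(polys):
    """det((1/mu!) f_nu^(mu)), mu, nu = 0, ..., l-1, as a coefficient list."""
    l = len(polys)
    rows = [[normalized_derivative(f, mu) for f in polys] for mu in range(l)]
    total = [Fraction(0)]
    for perm in permutations(range(l)):
        term = [Fraction(sign(perm))]
        for mu, nu in enumerate(perm):
            term = p_mul(term, rows[mu][nu])
        total = p_add(total, term)
    return total

def is_zero(f):
    return all(c == 0 for c in f)

dependent = [[1, 1], [0, 0, 1], [1, 1, 2]]       # third = first + 2 * second
independent = [[1], [0, 1], [0, 0, 1]]           # 1, z, z^2
print(is_zero(wronskian(dependent)), is_zero(wronskian(independent)))
```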

Proof: (a) The proof in this case is by induction. If $l = 1$, then
$W(z) = f_0(z)$, and the truth of the theorem is obvious.

Take $l > 1$, and suppose that the theorem is true for every set of
$l - 1$ polynomials, $f_0, f_1, \dots, f_{l-2}$, over K; suppose also that the
Wronskian $W_l$ of $f_0, \dots, f_{l-1}$ vanishes identically. If $f_0, \dots, f_{l-2}$
are dependent, so are $f_0, \dots, f_{l-1}$, and the assertion is proved.
Suppose then that $f_0, \dots, f_{l-2}$ are independent, so that their Wronskian
$W_{l-1}$ is not identically zero. Now $W_{l-1}$, being a polynomial,
has only finitely many zeros; let $I$ be an interval in which it does not
vanish, and take $z$ in $I$. For such $z$, the system of equations

$$\sum_{k=0}^{l-2} f_k^{(j)}(z)\,y_k = f_{l-1}^{(j)}(z), \qquad j = 0, 1, \dots, l - 2, \tag{7}$$

can be solved for the $y$'s as rational functions of $z$. But then, by subtracting
appropriate multiples of each column of $W_l$ from its last
column, we obtain

$$0 = 1! \cdots (l - 1)!\,W_l = \begin{vmatrix} f_0(z) & f_1(z) & \cdots & 0 \\ f_0'(z) & f_1'(z) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ f_0^{(l-1)}(z) & f_1^{(l-1)}(z) & \cdots & f_{l-1}^{(l-1)}(z) - \sum_{k=0}^{l-2} f_k^{(l-1)}(z)\,y_k \end{vmatrix}$$

$$= 1! \cdots (l - 2)!\,W_{l-1}\left(f_{l-1}^{(l-1)}(z) - \sum_{k=0}^{l-2} f_k^{(l-1)}(z)\,y_k\right),$$

so that also

$$\sum_{k=0}^{l-2} f_k^{(l-1)}(z)\,y_k = f_{l-1}^{(l-1)}(z). \tag{8}$$

Differentiating (7) gives

$$\sum_{k=0}^{l-2} f_k^{(j+1)}(z)\,y_k + \sum_{k=0}^{l-2} f_k^{(j)}(z)\,y_k' = f_{l-1}^{(j+1)}(z), \qquad j = 0, \dots, l - 2,$$

and comparison of this for $j = l - 2$ with (8), and for $j = 0, \dots, l - 3$
with (7), shows that

$$\sum_{k=0}^{l-2} f_k^{(j)}(z)\,y_k' = 0, \qquad j = 0, \dots, l - 2.$$

Since $W_{l-1} \ne 0$, it must be that

$$y_0' = \cdots = y_{l-2}' = 0,$$

so that the $y$'s are constants, say $y_k = c_k$, and they are clearly in K.
But then the polynomial

$$\sum_{k=0}^{l-2} c_kf_k(z) - f_{l-1}(z)$$

vanishes throughout $I$, and therefore identically, so that the $l$ polynomials
$f_0, f_1, \dots, f_{l-1}$ are dependent.

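(b) This case is proved by contradiction. Suppose that the
polynomials $f_0(z_1, \dots, z_p), \dots, f_{l-1}(z_1, \dots, z_p)$ are independent,
and suppose further that for each $\nu$, $f_\nu$ is of degree less than $k$ in each
of its arguments, so that we can write

$$f_\nu(z_1, \dots, z_p) = \sum_{k_1=0}^{k-1} \cdots \sum_{k_p=0}^{k-1} b_\nu(k_1, \dots, k_p)\,z_1^{k_1} \cdots z_p^{k_p}.$$

Then the polynomials $f_\nu(t, t^k, \dots, t^{k^{p-1}})$ are linearly independent.
For otherwise there would be an identity in $t$ of the form

$$\sum_{\nu=0}^{l-1} c_\nu \sum_{k_1=0}^{k-1} \cdots \sum_{k_p=0}^{k-1} b_\nu(k_1, \dots, k_p)\,t^{k_1 + k_2k + \cdots + k_pk^{p-1}} = 0,$$

or

$$\sum_{k_1=0}^{k-1} \cdots \sum_{k_p=0}^{k-1} \left(\sum_{\nu=0}^{l-1} c_\nu b_\nu(k_1, \dots, k_p)\right) t^{k_1 + k_2k + \cdots + k_pk^{p-1}} = 0,$$

and it would follow from the uniqueness of the representation of an
integer to the base $k$ that for each set of exponents $k_1, \dots, k_p$,

$$\sum_{\nu=0}^{l-1} c_\nu b_\nu(k_1, \dots, k_p) = 0,$$

whence

$$\sum_{\nu=0}^{l-1} c_\nu f_\nu(z_1, \dots, z_p) = 0,$$

contrary to assumption.

We know therefore that the Wronskian

$$W(t) = \det\left(\frac{1}{\mu!}\left(\frac{d}{dt}\right)^{\mu} f_\nu(t, t^k, \dots, t^{k^{p-1}})\right), \qquad \mu, \nu = 0, \dots, l - 1,$$

does not vanish identically. By a standard differentiation formula,

$$\frac{d}{dt} f_\nu(t, t^k, \dots, t^{k^{p-1}}) = \sum_{j=1}^{p} \frac{dz_j}{dt}\,\frac{\partial}{\partial z_j} f_\nu(z_1, \dots, z_p), \qquad z_j = t^{k^{j-1}},$$

and it follows easily by induction on $\mu$ that an operator identity

$$\frac{1}{\mu!}\left(\frac{d}{dt}\right)^{\mu} = \varphi_1(t)\Delta^{(1)} + \cdots + \varphi_r(t)\Delta^{(r)}$$

holds, where $\Delta^{(1)}, \dots, \Delta^{(r)}$ are differential operators of orders not
exceeding $\mu$, $r$ depends only on $\mu$ and $p$, and $\varphi_1, \dots, \varphi_r$ are polynomials
with rational coefficients. Using this in the above expression
for $W(t)$, and writing the resulting determinant as a sum of other
determinants, an expression for $W(t)$ of the form

$$W(t) = \psi_1(t)\,G_1(t, t^k, \dots, t^{k^{p-1}}) + \cdots + \psi_s(t)\,G_s(t, t^k, \dots, t^{k^{p-1}})$$

results, in which $\psi_1, \dots, \psi_s$ are polynomials and $G_1, \dots, G_s$ are
generalized Wronskians of $f_0, \dots, f_{l-1}$. Since $W(t)$ does not vanish
identically, there is an $i$ for which $G_i(t, \dots, t^{k^{p-1}})$ is not identically
zero, and a fortiori $G_i(z_1, \dots, z_p)$ is not identically zero.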

Theorem 4-8. Let $R(z_1, \dots, z_p)$ be a polynomial in $p \ge 2$ variables,
with integral coefficients in K such that

$$0 < \overline{|R|} \le B.$$

Let $R$ be of degree at most $r_j$ in $z_j$, for $j = 1, \dots, p$. Then there is
an $l$ in Z with

$$1 \le l \le r_p + 1, \tag{9}$$

there is an integer $\beta$ in K, and there are differential operators
$\Delta_0, \dots, \Delta_{l-1}$ on the variables $z_1, \dots, z_{p-1}$, of orders at most
$0, \dots, l - 1$, respectively, such that if

$$F(z_1, \dots, z_p) = \beta \det\left(\Delta_\mu \frac{1}{\nu!}\left(\frac{\partial}{\partial z_p}\right)^{\nu} R\right), \qquad \mu, \nu = 0, \dots, l - 1, \tag{10}$$

then

(a) $F$ has integral coefficients in K and is not identically zero;

(b) a decomposition

$$F(z_1, \dots, z_p) = U(z_1, \dots, z_{p-1})\,V(z_p) \tag{11}$$

holds, where $U$ and $V$ have integral coefficients in K, $U$ is of degree
at most $lr_j$ in $z_j$ for $j = 1, \dots, p - 1$, and $V$ is of degree at most
$lr_p$ in $z_p$;

(c) the following bound holds:

$$\overline{|F|} \le \left((r_1 + 1) \cdots (r_p + 1)\right)^{2l} 2^{2(r_1 + \cdots + r_p)l}\,(l!)^2 B^{2l}.$$

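Proof: Write $R$ as a polynomial in $z_p$:

$$R(z_1, \dots, z_p) = \sum_{\kappa=0}^{r_p} S_\kappa(z_1, \dots, z_{p-1})\,z_p^{\kappa}.$$

The polynomials $S_\kappa$ need not be independent; let $\psi_\nu(z_1, \dots, z_{p-1})$,
for $\nu = 0, \dots, l - 1$, be a maximal set of independent polynomials
among the $S_\kappa$, so that $1 \le l \le r_p + 1$. Then there are constants
$\beta_{\nu\kappa}$ in K such that for $\kappa = 0, \dots, r_p$,

$$S_\kappa(z_1, \dots, z_{p-1}) = \sum_{\nu=0}^{l-1} \beta_{\nu\kappa}\,\psi_\nu(z_1, \dots, z_{p-1}). \tag{12}$$

If we put

$$\varphi_\nu(z_p) = \sum_{\kappa=0}^{r_p} \beta_{\nu\kappa}\,z_p^{\kappa}, \qquad \nu = 0, \dots, l - 1,$$

then

$$R(z_1, \dots, z_p) = \sum_{\nu=0}^{l-1} \psi_\nu(z_1, \dots, z_{p-1})\,\varphi_\nu(z_p), \tag{13}$$

and $\varphi_0, \dots, \varphi_{l-1}$ are independent. For if $\delta_0, \dots, \delta_{l-1}$ are constants
such that

$$\delta_0\varphi_0(z_p) + \cdots + \delta_{l-1}\varphi_{l-1}(z_p) = 0,$$

the coefficient of each power of $z_p$ must be zero, so that

$$\delta_0\beta_{0\kappa} + \cdots + \delta_{l-1}\beta_{l-1,\kappa} = 0 \tag{14}$$

for $\kappa = 0, \dots, r_p$. For fixed $\nu_0$ with $0 \le \nu_0 \le l - 1$, choose $\kappa_0$ so
that $S_{\kappa_0}(z_1, \dots, z_{p-1}) = \psi_{\nu_0}(z_1, \dots, z_{p-1})$; this is possible since
the $\psi$'s are a subset of the $S$'s. Then (12) shows that

$$\beta_{\nu\kappa_0} = \begin{cases} 1 & \text{if } \nu = \nu_0, \\ 0 & \text{if } \nu \ne \nu_0. \end{cases}$$

Choosing $\kappa = \kappa_0$ in (14), we obtain $\delta_{\nu_0} = 0$. Since $\nu_0$ is arbitrary,
every $\delta_\nu = 0$.

Let $W(z_p)$ be the Wronskian of $\varphi_0, \dots, \varphi_{l-1}$; it is a polynomial
with coefficients in K, and it does not vanish identically. Let
$G(z_1, \dots, z_{p-1})$ be some generalized Wronskian of $\psi_0, \dots, \psi_{l-1}$
which is not identically zero. Then

$$W(z_p) = \det\left(\frac{1}{\nu!}\left(\frac{\partial}{\partial z_p}\right)^{\nu} \varphi_\rho(z_p)\right), \qquad \nu, \rho = 0, 1, \dots, l - 1,$$

$$G(z_1, \dots, z_{p-1}) = \det\left(\Delta_\mu \psi_\rho(z_1, \dots, z_{p-1})\right), \qquad \mu, \rho = 0, 1, \dots, l - 1,$$

where $\Delta_0, \dots, \Delta_{l-1}$ are differential operators on $z_1, \dots, z_{p-1}$, of
orders at most $0, \dots, l - 1$ respectively. Taking the row-by-row
product of $G$ and $W$, we obtain

$$GW = \det\left(\sum_{\rho=0}^{l-1} \Delta_\mu \psi_\rho \cdot \frac{1}{\nu!}\left(\frac{\partial}{\partial z_p}\right)^{\nu} \varphi_\rho\right),$$

or

$$GW = \det\left(\Delta_\mu \frac{1}{\nu!}\left(\frac{\partial}{\partial z_p}\right)^{\nu} R\right). \tag{15}$$

Since $W$ is a determinant of order $l$ whose elements are polynomials
in $z_p$ of degrees at most $r_p$, it is clear that $\deg W \le lr_p$. Similarly,
$G$ is of degree at most $lr_j$ in $z_j$ for $j = 1, \dots, p - 1$.

In the expression (15) for $GW$, we can write $R$ as the sum of
$(r_1 + 1) \cdots (r_p + 1)$ terms of the form $c\,z_1^{s_1} \cdots z_p^{s_p}$.
The determinant can then be written as a sum of

$$\left((r_1 + 1) \cdots (r_p + 1)\right)^l$$

new determinants, each having entries of the form

$$\Delta_\mu \frac{1}{\nu!}\left(\frac{\partial}{\partial z_p}\right)^{\nu} c\,z_1^{s_1} \cdots z_p^{s_p} = c\binom{s_1}{j_1} \cdots \binom{s_p}{j_p} z_1^{t_1} \cdots z_p^{t_p},$$

in which $t_j \le s_j$ for $j = 1, \dots, p$. Here

$$\binom{s_1}{j_1} \cdots \binom{s_p}{j_p} \le 2^{s_1 + \cdots + s_p} \le 2^{r_1 + \cdots + r_p}.$$

Thus the entries of each new determinant are such that the maxima
of the absolute values of their conjugates do not exceed

$$2^{r_1 + \cdots + r_p} B,$$

and hence

$$\overline{|GW|} \le \left((r_1 + 1) \cdots (r_p + 1)\right)^l l!\,2^{(r_1 + \cdots + r_p)l} B^l.$$

The coefficients in $GW$ are integers in K. It follows from Theorem
4-6 that if $\beta$ is any one of them which is not zero, there is a factorization
$\beta = \beta_1\beta_2$ in K such that $\beta_1G = U$ and $\beta_2W = V$ have integral
coefficients in K, and

$$\beta GW = F = UV.$$

By the bound just obtained for $\overline{|GW|}$, we have

$$0 < \overline{|F|} \le \overline{|\beta|}\;\overline{|GW|} \le \overline{|GW|}^{\,2} \le \left((r_1 + 1) \cdots (r_p + 1)\right)^{2l} 2^{2(r_1 + \cdots + r_p)l}\,(l!)^2 B^{2l}.$$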

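4-4 The index. Let $P(z_1, \dots, z_p)$ be any polynomial in $p$ variables
which does not vanish identically. Let $\alpha_1, \dots, \alpha_p$ be any
complex numbers, and let $r_1, \dots, r_p$ be any positive numbers. We
define the index $\theta$ of $P$ at the point $(\alpha_1, \dots, \alpha_p)$ relative to $r_1, \dots, r_p$
as follows. Expand $P(\alpha_1 + y_1, \dots, \alpha_p + y_p)$ as a polynomial in
$y_1, \dots, y_p$:

$$P(\alpha_1 + y_1, \dots, \alpha_p + y_p) = \sum_{j_1=0}^{\infty} \cdots \sum_{j_p=0}^{\infty} c(j_1, \dots, j_p)\,y_1^{j_1} \cdots y_p^{j_p}.$$

Then

$$\theta = \min\left(\frac{j_1}{r_1} + \cdots + \frac{j_p}{r_p}\right),$$

the minimum being extended over all sets of non-negative integers
$j_1, \dots, j_p$ for which $c(j_1, \dots, j_p) \ne 0$, or, equivalently, for which

$$\left(\frac{\partial}{\partial z_1}\right)^{j_1} \cdots \left(\frac{\partial}{\partial z_p}\right)^{j_p} P(\alpha_1, \dots, \alpha_p) \ne 0.$$

Note that $\theta \ge 0$ always, and that $\theta = 0$ if and only if

$$P(\alpha_1, \dots, \alpha_p) \ne 0.$$

Moreover, if any derived polynomial

$$\left(\frac{\partial}{\partial z_1}\right)^{k_1} \cdots \left(\frac{\partial}{\partial z_p}\right)^{k_p} P(z_1, \dots, z_p)$$

is not identically zero, it is clear that its index at $(\alpha_1, \dots, \alpha_p)$ relative
to $r_1, \dots, r_p$ is at least

$$\theta - \frac{k_1}{r_1} - \cdots - \frac{k_p}{r_p}.$$

The following properties, which we list in a theorem for later reference,
are also immediate consequences of the definition.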


Theorem 4-9. Let $P(z_1, \dots, z_p)$ and $Q(z_1, \dots, z_p)$ be polynomials,
neither of which vanishes identically. Then if we consider
indices formed at the same point $(\alpha_1, \dots, \alpha_p)$ relative to the same
numbers $r_1, \dots, r_p$, the following relations hold:

$$\text{index}\,(P + Q) \ge \min(\text{index}\,P, \text{index}\,Q), \tag{16}$$

$$\text{index}\,PQ = \text{index}\,P + \text{index}\,Q. \tag{17}$$

Equation (17) remains true if $P$ is a polynomial in $z_1, \dots, z_{p-1}$ only,
and $Q$ is a polynomial in $z_p$ only, provided that the index of $P$ is
taken at $(\alpha_1, \dots, \alpha_{p-1})$ relative to $r_1, \dots, r_{p-1}$, and that of $Q$ at
$\alpha_p$ relative to $r_p$.
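
The definition of the index and relations (16) and (17) can be tried out on polynomials in two variables. In the sketch below a polynomial is a dictionary mapping exponent pairs to integer coefficients; the representation, the function names, and the sample polynomials are assumptions made for illustration. Exact rational arithmetic is used so that the minimum is computed without rounding.

```python
from fractions import Fraction
from math import comb

def shift(poly, point):
    """Coefficients c(j1, j2) of P(alpha_1 + y_1, alpha_2 + y_2)."""
    a1, a2 = point
    out = {}
    for (i1, i2), c in poly.items():
        for j1 in range(i1 + 1):
            for j2 in range(i2 + 1):
                term = c * comb(i1, j1) * a1 ** (i1 - j1) * comb(i2, j2) * a2 ** (i2 - j2)
                out[(j1, j2)] = out.get((j1, j2), 0) + term
    return out

def index(poly, point, r):
    """min(j1/r1 + j2/r2) over exponent pairs with nonzero shifted coefficient."""
    shifted = shift(poly, point)
    return min(Fraction(j1, r[0]) + Fraction(j2, r[1])
               for (j1, j2), c in shifted.items() if c != 0)

def multiply(p, q):
    out = {}
    for (i1, i2), a in p.items():
        for (k1, k2), b in q.items():
            key = (i1 + k1, i2 + k2)
            out[key] = out.get(key, 0) + a * b
    return out

def add(p, q):
    out = dict(p)
    for key, b in q.items():
        out[key] = out.get(key, 0) + b
    return out

P = {(2, 0): 1, (1, 0): -2, (0, 0): 1}    # (z1 - 1)^2
Q = {(0, 1): 1, (0, 0): -2}               # z2 - 2
point, r = (1, 2), (2, 1)
print(index(P, point, r), index(Q, point, r), index(multiply(P, Q), point, r))
```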






Now let $r_1, \dots, r_m$ be positive integers, and suppose that $B \ge 1$.
We consider the set $\mathfrak{R}_m(B; r_1, \dots, r_m)$ of polynomials
$R(z_1, \dots, z_m)$ which satisfy the following conditions:

(a) $R$ has integral coefficients in K, and is not identically zero.

(b) $R$ is of degree at most $r_j$ in $z_j$, for $j = 1, \dots, m$.

(c) $\overline{|R|} \le B$.

Let $\zeta_1, \dots, \zeta_m$ be algebraic numbers (not necessarily in K) of
heights $H(\zeta_1) = q_1, \dots, H(\zeta_m) = q_m$. Let $\theta(R)$ denote the index
of $R(z_1, \dots, z_m)$ at the point $(\zeta_1, \dots, \zeta_m)$ relative to $r_1, \dots, r_m$.
Our object in the present section is to obtain, under certain conditions,
an upper bound for $\theta(R)$ in terms of $B, q_1, \dots, q_m, r_1, \dots, r_m$.
We therefore define

$$\Theta_m(B; q_1, \dots, q_m; r_1, \dots, r_m) = \sup \theta(R), \tag{18}$$

the supremum, or least upper bound, being taken over all $R$ in
$\mathfrak{R}_m(B; r_1, \dots, r_m)$ and all integers $\zeta_1, \dots, \zeta_m$ of heights $q_1, \dots, q_m$, respectively.

The double significance of $r_1, \dots, r_m$ in the definition (18) should
be noted; these numbers occur both in the definition of the index
and in condition (b) above.

We proceed by induction on $m$. In Theorem 4-10 the case $m = 1$
is treated, in Theorem 4-11 there is given a recurrence relation
between $\Theta_{m-1}$ and $\Theta_m$, and in Theorem 4-12 an explicit bound is
obtained.


Theorem 4-10.

$$\Theta_1(B; q_1; r_1) \le \frac{3N(N + 1)}{\log q_1} + \frac{N \log B}{r_1 \log q_1}.$$

Proof: Let the defining polynomial of $\zeta_1$ be

$$\chi(z_1) = d_0z_1^h + \cdots + d_h, \qquad d_0 \ne 0,$$

where $d_0, \dots, d_h$ are relatively prime rational integers, so that

$$q_1 = H(\zeta_1) = \|\chi\| = \max(|d_0|, \dots, |d_h|).$$

Each polynomial $R$ in $\mathfrak{R}_1(B; r_1)$ has integral coefficients in K; regarding
these coefficients as polynomials in a single primitive element, we
can obtain other polynomials from $R$ by successively replacing this
primitive element throughout by its various conjugates. Let $R^*$
be the product of these $N$ polynomials. By the Symmetric Function
Theorem, $R^*$ has coefficients in Z. Also, $\deg R^* \le Nr_1$, and by
Theorem 4-5,

$$\|R^*\| \le (1 + r_1)^N B^N.$$

By the definition of the index, $R(z_1)$ is divisible by $(z_1 - \zeta_1)^{r_1\theta}$,
and the same is therefore true of $R^*(z_1)$. Since $R^*(z_1)$ has coefficients
in Z, it is divisible by $\chi^{r_1\theta}$. One consequence of this fact is
that $hr_1\theta \le Nr_1$. Also, it follows from Theorem 4-3 that

$$\|\chi^{r_1\theta}\| \le 6^{Nr_1}\|R^*\|,$$

and, by Theorem 4-4,

$$q_1^{r_1\theta} = \|\chi\|^{r_1\theta} \le (hr_1\theta + 1)\|\chi^{r_1\theta}\| \le (Nr_1 + 1)\,6^{Nr_1}\|R^*\| \le (Nr_1 + 1)\,6^{Nr_1}(1 + r_1)^N B^N.$$

Hence

$$\theta \le \frac{N(N + 1)\log 12}{\log q_1} + \frac{N \log B}{r_1 \log q_1},$$

and the theorem follows from the fact that $\log 12 < 3$.


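Theorem 4-11. Let $p \ge 2$ be a positive integer, let $r_1, \dots, r_p$ be
positive integers such that

$$r_p \ge 10\delta^{-1}, \qquad \frac{r_{j-1}}{r_j} \ge \delta^{-1}, \quad \text{for } j = 2, \dots, p, \tag{19}$$

where $0 < \delta < 1$, and let $q_1, \dots, q_p$ be positive integers. Then

$$\Theta_p(B; q_1, \dots, q_p; r_1, \dots, r_p) \le 2\max_l\,(\Phi + \Phi^{1/2} + \delta^{1/2}), \tag{20}$$

where the maximum is taken over integers $l$ satisfying

$$1 \le l \le r_p + 1, \tag{21}$$

and where

$$\Phi = \Theta_1(M; q_p; lr_p) + \Theta_{p-1}(M; q_1, \dots, q_{p-1}; lr_1, \dots, lr_{p-1}) \tag{22}$$

and

$$M = (r_1 + 1)^{2pl}\,2^{2lpr_1}\,(l!)^2 B^{2l}. \tag{23}$$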

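Proof: Let $R(z_1, \dots, z_p)$ be any polynomial of the class
$\mathfrak{R}_p(B; r_1, \dots, r_p)$ and let $\zeta_1, \dots, \zeta_p$ be algebraic numbers of heights
$q_1, \dots, q_p$ respectively. Then $R$ satisfies the hypotheses of Theorem
4-8, so that there are numbers $l$ and $\beta$ and a polynomial $F(z_1, \dots, z_p)$
having the properties listed there. By Theorem 4-8,

$$\overline{|F|} \le \left((r_1 + 1) \cdots (r_p + 1)\right)^{2l} 2^{2(r_1 + \cdots + r_p)l}\,(l!)^2 B^{2l},$$

and hence

$$\overline{|F|} \le (r_1 + 1)^{2pl}\,2^{2lpr_1}\,(l!)^2 B^{2l} = M,$$

since $r_1 > r_2 > \cdots > r_p$ by (19). From the factorization

$$F(z_1, \dots, z_p) = U(z_1, \dots, z_{p-1})\,V(z_p)$$

and the fact that the arguments of $U$ and $V$ are disjoint, it follows
that also

$$\overline{|U|} \le M, \qquad \overline{|V|} \le M.$$

The polynomial $U(z_1, \dots, z_{p-1})$ has degree at most $lr_j$ in $z_j$, for
$j = 1, \dots, p - 1$. It is therefore an element of the class

$$\mathfrak{R}_{p-1}(M; lr_1, \dots, lr_{p-1}).$$

Hence, its index at $(\zeta_1, \dots, \zeta_{p-1})$ relative to $lr_1, \dots, lr_{p-1}$ is at most

$$\Theta_{p-1}(M; q_1, \dots, q_{p-1}; lr_1, \dots, lr_{p-1}).$$

It follows from the definition of the index that the index of $U$ at that
point relative to $r_1, \dots, r_{p-1}$ is at most

$$l\,\Theta_{p-1}(M; q_1, \dots, q_{p-1}; lr_1, \dots, lr_{p-1}).$$

Similarly, $V(z_p)$ is an element of the class $\mathfrak{R}_1(M; lr_p)$, and its index
at $\zeta_p$ relative to $r_p$ is at most

$$l\,\Theta_1(M; q_p; lr_p).$$

By the last sentence of Theorem 4-9, the index of $F = UV$ at
$(\zeta_1, \dots, \zeta_p)$ relative to $r_1, \dots, r_p$ is the sum of the indices of $U$ and
$V$, so that

$$\text{index}\,F \le l\Phi, \tag{24}$$

where $\Phi$ is defined in (22).

We now deduce from the determinantal representation of $F$ in
equation (10) a lower bound for the index of $F$ in terms of the index
$\theta$ of $R$. Consider first any differential operator of the form

$$\Delta = \frac{1}{j_1! \cdots j_{p-1}!}\left(\frac{\partial}{\partial z_1}\right)^{j_1} \cdots \left(\frac{\partial}{\partial z_{p-1}}\right)^{j_{p-1}}$$

of order

$$w = j_1 + \cdots + j_{p-1} \le l - 1.$$

If the polynomial

$$\Delta \frac{1}{\nu!}\left(\frac{\partial}{\partial z_p}\right)^{\nu} R(z_1, \dots, z_p)$$

does not vanish identically, its index at $(\zeta_1, \dots, \zeta_p)$ relative to
$r_1, \dots, r_p$ is at least

$$\theta - \sum_{i=1}^{p-1} \frac{j_i}{r_i} - \frac{\nu}{r_p} \ge \theta - \frac{w}{r_{p-1}} - \frac{\nu}{r_p}.$$

Now

$$\frac{w}{r_{p-1}} \le \frac{l - 1}{r_{p-1}} \le \frac{r_p}{r_{p-1}} \le \delta,$$

by the inequalities (21) and (19). Hence, since the index is non-negative,
it must be at least

$$\max\left(0, \theta - \frac{\nu}{r_p} - \delta\right) \ge \max\left(0, \theta - \frac{\nu}{r_p}\right) - \delta.$$

If we expand the determinant on the right side of (10), we obtain
for $F$ a sum of $l!$ terms, a typical term being

$$\pm\beta \prod_{\nu=0}^{l-1} \Delta_{\mu_\nu} \frac{1}{\nu!}\left(\frac{\partial}{\partial z_p}\right)^{\nu} R,$$

where $\Delta_{\mu_0}, \dots, \Delta_{\mu_{l-1}}$ are differential operators on $z_1, \dots, z_{p-1}$ whose
orders are at most $l - 1$. By Theorem 4-9, the index of such a term,
if it does not vanish identically, is at least

$$\sum_{\nu=0}^{l-1} \max\left(0, \theta - \frac{\nu}{r_p}\right) - l\delta.$$

Since $F$ is a sum of such terms, it follows from Theorem 4-9 again that

$$\text{index}\,F \ge \sum_{\nu=0}^{l-1} \max\left(0, \theta - \frac{\nu}{r_p}\right) - l\delta.$$

We may suppose that $\theta r_p \ge 10$, since otherwise

$$\theta < 10r_p^{-1} \le \delta \le 2\delta^{1/2}$$

and the desired inequality for $\theta$ then holds. Under this supposition,
$[\theta r_p]^2 > \tfrac{2}{3}\theta^2r_p^2$. Hence if $\theta r_p < l$, we have

$$\sum_{\nu=0}^{l-1} \max\left(0, \theta - \frac{\nu}{r_p}\right) = r_p^{-1} \sum_{\nu=0}^{[\theta r_p]} (\theta r_p - \nu) \ge \tfrac{1}{2}r_p^{-1}[\theta r_p]^2 \ge \tfrac{1}{3}\theta^2r_p,$$

while if $\theta r_p \ge l$, then

$$\sum_{\nu=0}^{l-1} \max\left(0, \theta - \frac{\nu}{r_p}\right) = \sum_{\nu=0}^{l-1} \left(\theta - \frac{\nu}{r_p}\right) \ge \tfrac{1}{2}l\theta.$$

Hence

$$\text{index}\,F \ge \min\left(\tfrac{1}{2}l\theta, \tfrac{1}{3}\theta^2r_p\right) - l\delta. \tag{25}$$

Combining (24) and (25), we obtain

$$\min\left(\tfrac{1}{2}l\theta, \tfrac{1}{3}\theta^2r_p\right) \le l(\Phi + \delta).$$

Thus either $\theta \le 2(\Phi + \delta)$, in which case $\theta$ satisfies the desired inequality,
or

$$\tfrac{1}{3}\theta^2r_p \le l(\Phi + \delta) \le (r_p + 1)(\Phi + \delta).$$

Since $r_p + 1 \le 4r_p/3$ by (19), this gives

$$\theta \le 2(\Phi + \delta)^{1/2} \le 2(\Phi^{1/2} + \delta^{1/2}),$$

and the proof is complete.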
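
The elementary estimate behind (25), namely that the sum of $\max(0, \theta - \nu/r_p)$ over $\nu = 0, \dots, l - 1$ is at least $\min(\tfrac{1}{2}l\theta, \tfrac{1}{3}\theta^2 r_p)$ when $\theta r_p \ge 10$, can be checked by exact arithmetic on a grid of values. The sketch below does this; the function names and the sample ranges are illustrative, and `theta` is expected to be a `Fraction`.

```python
from fractions import Fraction

def column_sum(theta, l, r_p):
    """Sum over nu = 0, ..., l-1 of max(0, theta - nu / r_p)."""
    return sum(max(Fraction(0), theta - Fraction(nu, r_p)) for nu in range(l))

def lower_bound(theta, l, r_p):
    """min(l*theta/2, theta^2 * r_p / 3), the quantity appearing in (25)."""
    return min(l * theta / 2, theta ** 2 * r_p / 3)

r_p = 12
theta = Fraction(11, 12)                 # theta * r_p = 11 >= 10
for l in (1, 5, 13):
    print(l, column_sum(theta, l, r_p), lower_bound(theta, l, r_p))
```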

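Theorem 4-12. Let $m$ be a positive integer, and suppose that

$$0 < \delta < \frac{1}{m^2\,2^m (N + 1)^2}. \tag{26}$$

Let $r_1, \dots, r_m$ be positive integers such that

$$r_m \ge 10\delta^{-1}, \qquad \frac{r_{j-1}}{r_j} \ge \delta^{-1}, \quad j = 2, \dots, m. \tag{27}$$

Let $q_1, \dots, q_m$ be positive integers such that

$$\log q_1 > 2\delta^{-1}m(2m + 1), \tag{28}$$

$$r_j \log q_j \ge r_1 \log q_1, \quad j = 2, \dots, m, \tag{29}$$

$$\log q_1 \ge 3\delta^{-1}N(N + 1). \tag{30}$$

Then

$$\Theta_m(q_1^{\delta r_1}; q_1, \dots, q_m; r_1, \dots, r_m) < 10^m\delta^{(1/2)^m}. \tag{31}$$
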




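Proof: The proof is by induction on $m$. For $m = 1$, we apply
Theorem 4-10, together with the inequalities (30) and (26), and
obtain

$$\Theta_1(q_1^{\delta r_1}; q_1; r_1) \le \frac{3N(N + 1)}{\log q_1} + \frac{N \log(q_1^{\delta r_1})}{r_1 \log q_1} \le (N + 1)\delta < 10\delta^{1/2},$$

which is the desired inequality.

Now suppose that $p \ge 2$ is an integer, and that the theorem holds
when $m = p - 1$. When $m = p$, the hypotheses of the present
theorem are more stringent than those of Theorem 4-11, so that the
latter is applicable here. We must estimate $M$ and $\Phi$.

We have

$$M = (r_1 + 1)^{2pl}\,2^{2lpr_1}\,(l!)^2 B^{2l} \le \left((r_1 + 1)^{2p}\,2^{2pr_1}\,l^2 q_1^{2\delta r_1}\right)^l.$$

Since $l \le r_p + 1 \le r_1 + 1 \le 2^{r_1}$, it follows that

$$M \le \left(2^{(4p+2)r_1} q_1^{2\delta r_1}\right)^l \le \left(e^{(4p+2)r_1} q_1^{2\delta r_1}\right)^l.$$

By (28) with $m = p$, we have $4p + 2 \le \delta p^{-1} \log q_1$, so that

$$M \le q_1^{\delta_1 lr_1},$$

where

$$\delta_1 = 2\delta(1 + p^{-1}). \tag{32}$$

Thus

$$\Theta_1(M; q_p; lr_p) \le \Theta_1(q_1^{\delta_1 lr_1}; q_p; lr_p) \tag{33}$$

and

$$\Theta_{p-1}(M; q_1, \dots, q_{p-1}; lr_1, \dots, lr_{p-1}) \le \Theta_{p-1}(q_1^{\delta_1 lr_1}; q_1, \dots, q_{p-1}; lr_1, \dots, lr_{p-1}). \tag{34}$$

Moreover, (32), together with the inequality (26) with $m = p$,
implies that

$$\delta_1 < \frac{1 + p^{-1}}{p^2\,2^{p-1}(N + 1)^2} \le \frac{1}{(p - 1)^2\,2^{p-1}(N + 1)^2}. \tag{35}$$

In particular, $(N + 1)\delta_1 < \delta_1^{1/2}$.

It follows from (30), and the fact that $q_p > q_1$, that

$$\log q_p > 3\delta^{-1}N(N + 1).$$

Hence by Theorem 4-10, the right side of (33) does not exceed

$$\frac{3N(N + 1)}{\log q_p} + \frac{N\delta_1 lr_1 \log q_1}{lr_p \log q_p} \le \delta + N\delta_1 < (N + 1)\delta_1 < \delta_1^{1/2};$$

here we have used (29).

To estimate the right side of (34), we use the induction hypothesis,
that the theorem holds when $m = p - 1$. The conditions of the
theorem are satisfied for $m = p - 1$, if we replace $\delta$ by $\delta_1$ and
$r_1, \dots, r_{p-1}$ by $lr_1, \dots, lr_{p-1}$; since $\delta_1 > \delta$, this is obvious for all
the relations but (26), which has already been verified in (35). It
follows that

$$\Theta_{p-1}(q_1^{\delta_1 lr_1}; q_1, \dots, q_{p-1}; lr_1, \dots, lr_{p-1}) < 10^{p-1}\delta_1^{(1/2)^{p-1}}.$$

Hence, since $\delta_1 < 4\delta$, the two results just proved imply that

$$\Phi < 3 \cdot 10^{p-1}\delta^{(1/2)^{p-1}}.$$

Finally, (20) gives

$$\Theta_p(q_1^{\delta r_1}; q_1, \dots, q_p; r_1, \dots, r_p) < 2\left\{3\left(10^{p-1}\delta^{(1/2)^{p-1}}\right) + 3^{1/2}\,10^{(p-1)/2}\delta^{(1/2)^p} + \delta^{1/2}\right\}$$

$$\le 2\left(\frac{3}{10} + \frac{3^{1/2}}{10^{(p+1)/2}} + \frac{1}{10^p}\right) 10^p\delta^{(1/2)^p} < 10^p\delta^{(1/2)^p}.$$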


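4-5 A combinatorial lemma

Theorem 4-13. If $r_1, \dots, r_m$ are any positive integers, and $\lambda > 0$,
then the number $A_m(\lambda)$ of sets of integers $j_1, \dots, j_m$ which satisfy the
inequalities

$$0 \le j_1 \le r_1, \quad \dots, \quad 0 \le j_m \le r_m,$$

$$\frac{j_1}{r_1} + \cdots + \frac{j_m}{r_m} \le \tfrac{1}{2}(m - \lambda)$$

does not exceed

$$2m^{1/2}\lambda^{-1}(r_1 + 1) \cdots (r_m + 1).$$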
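
The count $A_m(\lambda)$ can be computed by direct enumeration when the $r_j$ are small, which gives a feel for how generous the bound is. The sketch below is a brute-force illustration with hypothetical function names; it uses exact fractions for the sum so that the comparison with $\tfrac{1}{2}(m - \lambda)$ is not affected by rounding, and `lam` is expected to be a `Fraction` or an integer.

```python
from fractions import Fraction
from itertools import product
from math import prod, sqrt

def count_sets(r, lam):
    """A_m(lambda): number of (j_1, ..., j_m), 0 <= j_i <= r_i,
    with j_1/r_1 + ... + j_m/r_m <= (m - lambda)/2."""
    m = len(r)
    limit = (m - Fraction(lam)) / 2
    return sum(1 for js in product(*(range(rj + 1) for rj in r))
               if sum(Fraction(j, rj) for j, rj in zip(js, r)) <= limit)

def bound(r, lam):
    """The upper bound 2 * sqrt(m) / lambda * (r_1 + 1) ... (r_m + 1)."""
    m = len(r)
    return 2 * sqrt(m) / float(lam) * prod(rj + 1 for rj in r)

r = (4, 3, 2)
for lam in (Fraction(1, 2), Fraction(1), Fraction(2), Fraction(3)):
    print(lam, count_sets(r, lam), round(bound(r, lam), 2))
```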

Proof: We proceed by induction on m. The theorem holds for 
m = 1, since the number of integers such that 

0 < ji < n. ii ^ 5(1 ” 

is at most ri + 1, and is 0 if X > 1. , . 

Now suppose m > 1. The result is trivial if X < 2m*. since then 

the conditions on the individual's give an improvement of the desired 

upper bound. Hence we may suppose that X > 2m . If we fix jm. 

we must count the sets of integers ji,.. •, Jm-i such that 



4-5] 


A COMBINATORIAL LEMMA 


143 


0 < ii < ri, ..., 0 < j^i < fm- 


ri 


^m—I 2 \ Tfn j 


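Putting

$$m - \lambda - \frac{2j_m}{r_m} = (m - 1) - \lambda',$$

or, what is the same thing,

$$\lambda' = \lambda'(j_m) = \lambda - 1 + \frac{2j_m}{r_m},$$

we see that

$$A_m(\lambda) = \sum_{j_m=0}^{r_m} A_{m-1}(\lambda'(j_m)).$$

By the induction hypothesis,

$$A_m(\lambda) \le 2(m - 1)^{1/2}(r_1 + 1) \cdots (r_{m-1} + 1) \sum_{j=0}^{r_m} \left(\lambda - 1 + \frac{2j}{r_m}\right)^{-1},$$

and it suffices to prove that

$$\sum_{j=0}^{r} \left(\lambda - 1 + \frac{2j}{r}\right)^{-1} \le \lambda^{-1}(m - 1)^{-1/2}m^{1/2}(r + 1)$$

for all positive integers $r$ and $m$, if $\lambda > 2m^{1/2}$.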

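If $r$ is even, we put $j = \frac{1}{2}r + k$ and obtain the sum

$$\begin{aligned}
\sum_{k=-r/2}^{r/2} \left(\lambda + \frac{2k}{r}\right)^{-1}
&= \lambda^{-1} + \sum_{k=1}^{r/2} \left\{\left(\lambda + \frac{2k}{r}\right)^{-1} + \left(\lambda - \frac{2k}{r}\right)^{-1}\right\} \\
&= \lambda^{-1} + 2\lambda \sum_{k=1}^{r/2} \left(\lambda^2 - \frac{4k^2}{r^2}\right)^{-1} \\
&\le \lambda^{-1} + 2\lambda \sum_{k=1}^{r/2} (\lambda^2 - 1)^{-1} \\
&= \lambda^{-1} + 2\lambda^{-1} \sum_{k=1}^{r/2} (1 - \lambda^{-2})^{-1} \\
&\le \lambda^{-1}(r + 1)(1 - \lambda^{-2})^{-1}.
\end{aligned}$$

Since $1 - \lambda^{-2} > 1 - m^{-1}/4 > (1 - m^{-1})^{1/2}$, we have the desired
inequality.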




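If $r$ is odd, we put $j = (r - 1)/2 + k$ and obtain the sum

$$\sum_{k=-(r-1)/2}^{(r+1)/2} \left(\lambda + \frac{2k - 1}{r}\right)^{-1} = 2\lambda \sum_{k=1}^{(r+1)/2} \left(\lambda^2 - \frac{(2k - 1)^2}{r^2}\right)^{-1} \le \lambda(\lambda^2 - 1)^{-1}(r + 1),$$

and the result is as before.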

4-6 The approximation polynomial. Let $\alpha$ be an algebraic integer
of degree $n \ge 2$ over $K$, so that $\alpha$ is a zero of a polynomial which has
integral coefficients in $K$ and which cannot be factored into a product
of such polynomials of positive degrees. Let $L = K(\alpha)$ be the field
obtained by adjoining $\alpha$ to $K$. Finally, let $\omega_1, \dots, \omega_N$ be an integral
basis for $K$, and put

= 6i, max (foul,...» = 62. ( 36 ) 

In the remainder of the proof we shall be concerned with a single
set of values of $m$, $\delta$, $q_1$, $\zeta_1$, $\dots$, $q_m$, $\zeta_m$, $r_1$, $\dots$, $r_m$, which will be
chosen later in the order just specified. The choice will be made so
as to satisfy the following conditions:

0 < 5 < m-^2r"{N + 1)“^ (37) 

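$$10^m \delta^{(1/2)^m} + 2(1 + 3\delta)nm^{1/2} < \frac{m}{2}, \tag{38}$$

$$r_m > 10\delta^{-1}, \qquad \frac{r_{j-1}}{r_j} > \delta^{-1}, \quad \text{for } j = 2, \dots, m, \tag{39}$$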

n 

5 ^ log gi > 2m + 1 + ^ log (^*1 + 1) + 462A^, (40) 

$$r_j \log q_j \ge r_1 \log q_1, \quad \text{for } j = 2, \dots, m, \tag{41}$$

log?i > Sr^NiN (42) 


Notice that these conditions imply those of Theorem 4-12, since (37)
and (40) together imply that $\delta \log q_1 > 2m(2m + 1)$.






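Define $\lambda$, $\mu$, $\eta$, $B_1$ by the equations

$$\lambda = 4(1 + 3\delta)nm^{1/2}, \tag{43}$$

$$\mu = \frac{1}{2}(m - \lambda), \tag{44}$$

$$\eta = 10^m \delta^{(1/2)^m}, \tag{45}$$

$$B_1 = [q_1^{\delta r_1}]. \tag{46}$$

Then (38) is equivalent to

$$\eta < \mu. \tag{47}$$

Also,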

< Bx, 


since “s/x < 

X -- 1 < [x] for all X > (3 + y/5)/2, and 



^ > g(2m+l)n > g3r„ ^30^ 

We come now to the main lemma, which will be the only one to 
which reference is made in the eventual proof of the Thue-Siegel-Roth 
theorem. 


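Theorem 4-14. Suppose that the conditions (37) through (42) are
satisfied, and suppose that $\zeta_1, \dots, \zeta_m$ are algebraic numbers of
heights $q_1, \dots, q_m$, respectively. Then there exists a polynomial
$Q(z_1, \dots, z_m)$ with integral coefficients in $K$ and of degree at most
$r_j$ in $z_j$, for $j = 1, \dots, m$, such that

(a) the index of $Q$ at the point $(\alpha, \dots, \alpha)$ relative to $r_1, \dots, r_m$ is
at least $\mu - \eta$;

(b) $Q(\zeta_1, \dots, \zeta_m) \ne 0$;

(c) for all derivatives

$$Q_{i_1 \cdots i_m}(z_1, \dots, z_m) = \frac{1}{i_1! \cdots i_m!} \frac{\partial^{i_1 + \cdots + i_m}}{\partial z_1^{i_1} \cdots \partial z_m^{i_m}} Q(z_1, \dots, z_m),$$

where $i_1, \dots, i_m$ are non-negative integers, the inequality

$$|Q_{i_1 \cdots i_m}(z_1, \dots, z_m)| < B_1^{1+3\delta} \prod_{j=1}^{m} (1 + |z_j|)^{r_j}$$

holds, and the corresponding inequality also holds if the coefficients in
$Q$ are replaced by their respective field conjugates.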

Proof: Let $c_1, \dots, c_N$ range independently over the non-negative
rational integers not exceeding $B_1$, and let $C$ be the set of integers of
$K$ of the form

$$c_1\omega_1 + \cdots + c_N\omega_N.$$

The number of elements of $C$ is $(1 + B_1)^N$, and if we put

$$(1 + r_1) \cdots (1 + r_m) = r,$$

there are

$$(1 + B_1)^{Nr} \tag{48}$$

distinct polynomials

$$P(z_1, \dots, z_m) = \sum_{s_1=0}^{r_1} \cdots \sum_{s_m=0}^{r_m} \gamma(s_1, \dots, s_m) z_1^{s_1} \cdots z_m^{s_m}$$

whose coefficients $\gamma(s_1, \dots, s_m)$ belong to $C$. For $\gamma(s_1, \dots, s_m)$ in $C$,

$$|\gamma(s_1, \dots, s_m)| \le b_2 B_1 N, \tag{49}$$

and if we put 



then 


since mri log 2 < log q\ by (40). Now replace all of Zi,..., Zm 
by a. Since the total number of terms is at most r, and since, by (40), 

r = (ri + 1) • - • (r„ + 1) < < (&i + 1)’"''^ < (50) 


we obtain the bound 



•im 





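Let $\vartheta$ be a primitive element of $L$, so that $L = R(\vartheta)$. Order the
conjugates of $\vartheta$ so that $\vartheta_1, \dots, \vartheta_{\rho_1}$ are real and $\vartheta_{\rho_1+\nu}$ and $\vartheta_{\rho_1+\rho_2+\nu}$ are
complex-conjugate for $\nu = 1, \dots, \rho_2$, so that $\rho_1 + 2\rho_2 = nN$. Let $\xi$
be a fixed one of the numbers $P_{j_1 \cdots j_m}(\alpha, \dots, \alpha)$, where $j_1, \dots, j_m$
satisfy the inequalities

$$0 \le j_1 \le r_1, \ \dots, \ 0 \le j_m \le r_m, \qquad \frac{j_1}{r_1} + \cdots + \frac{j_m}{r_m} \le \mu. \tag{51}$$

Then $\xi$ can be written as a polynomial in $\vartheta$, with rational coefficients,
and as such has field conjugates $\xi^{(\nu)}$, $\nu = 1, \dots, nN$. Hence we can
define $nN$ real numbers $\xi_1, \dots, \xi_{nN}$ by the equations

$$\xi_\nu = \xi^{(\nu)} \quad \text{for } \nu = 1, \dots, \rho_1,$$

$$\xi_\nu = \operatorname{Re} \xi^{(\nu)}, \quad \xi_{\nu+\rho_2} = \operatorname{Im} \xi^{(\nu)} \quad \text{for } \rho_1 + 1 \le \nu \le \rho_1 + \rho_2.$$

Collecting them in a fixed order for fixed coefficients $\gamma(s_1, \dots, s_m)$
and for all $j_1, \dots, j_m$ satisfying the inequalities (51), we have a set
of numbers which can be considered as coordinates of a point; by
Theorem 4-13 there are

$$M \le 2nNm^{1/2}\lambda^{-1}r$$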


coordinates, and each is numerically smaller than {b 2 N= 1. 
Thus all the points, for the various sets of coefficients in $C$, lie in a
cube of edge $2t$ in $M$-dimensional space. If each edge is divided into
$3t$ equal parts, we get $(3t)^M$ subcubes of edge $2/3$. By (48), if

$$(1 + B_1)^{Nr} > (3t)^M, \tag{52}$$


there are more points than subcubes, and the points corresponding
to two different polynomials $P^*(z_1, \dots, z_m)$ and $P^{**}(z_1, \dots, z_m)$ lie
in the same subcube. If we put

$$P(z_1, \dots, z_m) = P^*(z_1, \dots, z_m) - P^{**}(z_1, \dots, z_m),$$

then

$$\left|\left(P_{j_1 \cdots j_m}(\alpha, \dots, \alpha)\right)^{(\nu)}\right| \le \sqrt{2} \cdot \frac{2}{3} < 1$$

for $j_1, \dots, j_m$ as in (51). Since $P_{j_1 \cdots j_m}(\alpha, \dots, \alpha)$ is an algebraic
integer whose norm is numerically smaller than 1, it must be zero.
Hence the index of $P$ at the point $(\alpha, \dots, \alpha)$ relative to $r_1, \dots, r_m$
is at least $\mu$. Also the coefficients $\gamma(s_1, \dots, s_m)$ in $P$ are integers of
$K$, not all zero, such that the relation (49) holds.

To verify (52), notice that by the inequality (40), 

> 4b2^V, 

and hence 

Bi > 4b2N, 

Bi^’’- > (462A'B,)i^'", 

Bi‘'^''‘ > + 3 )i vr(i+3j)^ 

(H-B,)^‘^> {30-^ 




We now apply Theorem 4-12, the hypotheses of which are 
satisfied, as was noted earlier. Since P belongs to the class 
) > ^m)} its index at (^j, * • • j fm) relative to ri, . . . , 

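is less than $\eta$, defined in (45). Hence $P$ possesses some derivative

$$Q(z_1, \dots, z_m) = \frac{1}{k_1! \cdots k_m!} \left(\frac{\partial}{\partial z_1}\right)^{k_1} \cdots \left(\frac{\partial}{\partial z_m}\right)^{k_m} P(z_1, \dots, z_m),$$

with

$$\frac{k_1}{r_1} + \cdots + \frac{k_m}{r_m} < \eta,$$

such that

$$Q(\zeta_1, \dots, \zeta_m) \ne 0.$$

The index of $Q$ at the point $(\alpha, \dots, \alpha)$ relative to $r_1, \dots, r_m$ is at
least $\mu - \eta$. Thus $Q$ has the properties (a) and (b) of Theorem 4-14.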
From the relations (49) and (50), 


< ^+-+^”h2NBi < < h2NB 


1+6 


Hence for an arbitrary derivative, 

Finally, 

m 

|Q,v..,„(z„ ..., < hNB,'+<“ n (1 + k.| + • • • + W") 


^-1 


m 


< n (1 + MV 




m 


(1 +wr, 

since $b_2N < B_1^{\delta}$ by (40). The same inequality holds for the conjugate
polynomials, and the proof is complete.

4-7 The Thue-Siegel-Roth theorem 

Theorem 4-15. Let $K$ be an algebraic number field of degree $N$,
and let $\alpha$ be algebraic. Then for each $\kappa > 2$, the inequality

$$|\alpha - \zeta| < \frac{1}{(H(\zeta))^{\kappa}} \tag{53}$$

has only finitely many solutions $\zeta$ in $K$.





Proof: We shall suppose that the theorem is false, so that (53) has 
infinitely many solutions, and produce a contradiction. We may 
suppose also that a is an integer. For if not there is a positive 
rational integer a such that aa is an algebraic integer, and for each 
solution f of (53) we have 

(H(f))" - (//(a!-))*' 


Hence for arbitrary $\epsilon > 0$, and for all solutions $\zeta$ with $H(\zeta)$ sufficiently
large,

$$|a\alpha - a\zeta| < \frac{1}{(H(a\zeta))^{\kappa - \epsilon}},$$

and $\epsilon$ can be chosen so small that $\kappa - \epsilon > 2$.

Finally, it suffices to prove that (53) has only finitely many solutions
in primitive elements $\zeta$ of $K$. For an algebraic number field
has only finitely many subfields, and every element of $K$ is a primitive
element of some one of its subfields; moreover, the inequality in question
does not depend on the degree of $\alpha$ over $K$.

We first choose $m$ so large that $m > 4nm^{1/2}$ and

$$\frac{2m}{m - 4nm^{1/2}} < \kappa, \tag{54}$$

which is possible since $\kappa > 2$. For sufficiently small $\delta$ we have

$$m - 4(1 + 3\delta)nm^{1/2} - 2\eta > 0,$$

where $\eta$, given by (45), becomes arbitrarily small with $\delta$. This
condition is the same as that of (38). We choose $\delta$ to satisfy this
and the inequality (37), and finally the inequality

$$\frac{2m(1 + \delta) + 2\delta N(2 + 5\delta)}{m - 4(1 + 3\delta)nm^{1/2} - 2\eta} < \kappa, \tag{55}$$

which is possible in view of (54). The inequality (55) is equivalent to

$$\frac{m(1 + \delta) + \delta N(2 + 5\delta)}{\mu - \eta} < \kappa, \tag{56}$$

by equations (43) and (44).

Having chosen $m$ and $\delta$, we now choose a solution $\zeta_1$ of (53) (a
primitive element of $K$) with $H(\zeta_1) = q_1$, and with $q_1$ so large as to
satisfy (40) and (42). We then choose further primitive solutions
$\zeta_2, \dots, \zeta_m$ of heights $q_2, \dots, q_m$, such that for $j = 2, \dots, m$,

$$\frac{\log q_j}{\log q_{j-1}} > \frac{2}{\delta}. \tag{57}$$

We now take $r_1$ to be any integer such that

$$r_1 > \frac{10 \log q_m}{\delta \log q_1}, \tag{58}$$

and define $r_j$, for $j = 2, \dots, m$, by

$$\frac{r_1 \log q_1}{\log q_j} \le r_j < \frac{r_1 \log q_1}{\log q_j} + 1. \tag{59}$$

Then the inequality (41) is satisfied. Also,

$$\frac{r_j \log q_j}{r_1 \log q_1} < 1 + \frac{\log q_j}{r_1 \log q_1} \le 1 + \frac{\log q_m}{r_1 \log q_1} < 1 + \frac{\delta}{10}, \tag{60}$$

by (58). The conditions (39) are satisfied, since

$$r_m \ge \frac{r_1 \log q_1}{\log q_m} > 10\delta^{-1},$$

and

$$\frac{r_{j-1}}{r_j} > \frac{\log q_j}{\log q_{j-1}} \left(1 + \frac{\delta}{10}\right)^{-1} > \delta^{-1},$$

by (59), (60), and (57).

We know from Theorem 4-14 that there exists a polynomial
$Q(z_1, \dots, z_m)$, whose properties are listed in that theorem. Let
$\zeta_1, \dots, \zeta_m$ be zeros of irreducible polynomials of degree $N$ with
relatively prime coefficients in $Z$, the coefficients of $z^N$ being
$k_1, \dots, k_m$, respectively. Then the number

$$\varphi = Q(\zeta_1, \dots, \zeta_m)$$

is an element of $K$. If the field conjugates of $\zeta_i$ are $\zeta_i^{(1)}, \dots, \zeta_i^{(N)}$ for
$i = 1, \dots, m$, then $N\varphi$ is a sum of products of powers of the $\zeta_i^{(j)}$
with integral coefficients from $K$, and in each such product a factor
$\zeta_i^{(j)}$ occurs to the power $r_i$ at most. In the proof of Theorem 2-21,
it was shown that the product of $k_i$ and any set of distinct conjugates
of $\zeta_i$ is an algebraic integer. For each $i$, the field conjugates of $\zeta_i$
are distinct, because $\zeta_i$ is a primitive element of $K$. It follows that
$k_1^{r_1} \cdots k_m^{r_m} N\varphi$ is an algebraic integer, and since it is also rational
it is a rational integer, so that

$$|k_1^{r_1} \cdots k_m^{r_m} N\varphi| \ge 1. \tag{61}$$

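On the other hand, we have

$$Q(\zeta_1, \dots, \zeta_m) = \sum_{i_1=0}^{r_1} \cdots \sum_{i_m=0}^{r_m} Q_{i_1 \cdots i_m}(\alpha, \dots, \alpha)(\zeta_1 - \alpha)^{i_1} \cdots (\zeta_m - \alpha)^{i_m},$$

and, by part (a) of Theorem 4-14, the terms with

$$\frac{i_1}{r_1} + \cdots + \frac{i_m}{r_m} < \mu - \eta$$

all vanish. In all other terms we have

$$|(\zeta_1 - \alpha)^{i_1} \cdots (\zeta_m - \alpha)^{i_m}| < (q_1^{i_1} \cdots q_m^{i_m})^{-\kappa} \le q_1^{-r_1(\mu - \eta)\kappa},$$

since $q_j^{r_j} \ge q_1^{r_1}$ by (41). Hence, using part (c) of Theorem 4-14,
we have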


M < (ri,+ 1) • ‘ • (r„ + l)5i'+3«(i _j_ -r.o.-,)* 

and by using part (c) again, together with Theorem 4-2, we obtain 

1*1< B/+®®^j~n0i-'j)«5^(w-i)(i+3a) 

AT 


X n 


ki n (1 + 
; -1 


m 


Now, by (50), 

6iVh+-+..) < 

so that 

< gj*^n(2+6i)+mri(l+a)—»!(#*—•»)* 




This, together with (61), implies that

$$\delta N(2 + 5\delta) + m(1 + \delta) > (\mu - \eta)\kappa,$$

or

$$\kappa < \frac{m(1 + \delta) + \delta N(2 + 5\delta)}{\mu - \eta},$$

which contradicts (56). This completes the proof.

4-8 Applications to Diophantine equations. The Thue-Siegel-Roth 
theorem will now be applied to show that a rather large variety of 
Diophantine equations have only finitely many solutions. 

Theorem 4-16. Let $U(x, y)$ be a binary form of degree $n$, without
multiple linear factors, whose coefficients belong to an algebraic
number field $K_0$ of degree $h$. Let $x$ and $y$ be integral variables of $K_0$.
Suppose that

$$n > 2h.$$

Let $V(x, y)$ be any polynomial of total degree $v < n - 2h$ which has
coefficients in $K_0$ and has no common factor with $U(x, y)$. Then the
equation

$$U(x, y) = V(x, y) \tag{62}$$

has only finitely many solutions.


Proof: Just as in the representation theory for binary quadratic
forms, it makes no difference whether we consider (62) or an equation
obtained from it by a substitution $x = ax' + by'$, $y = cx' + dy'$,
where $a$, $b$, $c$, $d$ are in $Z$, and $|ad - bc| = 1$. If $U(x, y) = a_0x^n + \cdots + a_ny^n$, then

$$U(x, ax + y) = U(1, a)x^n + \cdots + a_ny^n,$$

$$U(x + by, y) = a_0x^n + \cdots + U(b, 1)y^n.$$

Choose $a$ in $K_0$ so that $U(1, a) \ne 0$, and put $U(x, ax + y) = U_1(x, y)$.
Then choose $b$ in $K_0$ so that $U_1(b, 1) \ne 0$, and put
$U_1(x + by, y) = U_2(x, y)$. Dropping the subscript, we see that
there is no loss in generality in supposing that the coefficients of
$x^n$ and $y^n$ in $U(x, y)$ are different from zero, and we can write

$$U(x, y) = a \prod_{k=1}^{n} (x - \alpha_ky), \tag{63}$$




where neither a uor any is zero. By assumption, the numbers 
are distinct, so if we put 

Cl = min ( 1 ^> - 

then Cl > 0 , and for every x and y, at least n — 1 of the factors in the 
product occurring in (63) have absolute values not less than 5 C 1 . 

Let X = »? and y = f 0 be integers of A"©, with field conjugates 
77 '**, . . . , . . - , Then as we saw in Theorem 2-5, 

U = (QiOY, 

>-i 

where Q(/) is an irreducible polynomial with coefficients in Z, and 
I <f < h. Let il/ = max (fFl, r?).and name the conjugates so that 

h// 

Q{t) = n 

Then the coefficients of Q(0 are numerically smaller then the cor¬ 
responding coefficients of 

n + M), 

so that llQll < {2MY>^. Aforli&ri, H{v/C) < 

Now by Theorem 4-15, there are only finitely many solutions of 
the inequality 




< 


1 


for fixed > 0. Hence for M sufficiently large, and < = i'h, 




1 


1 


(2A/)''(2+«')// - (2M)2*+‘ ’ 


at least if the left side is not zero. This is certainly true of the solu¬ 
tions of (62), since U{x,y) and V{x,y) have no common factor. 
The same argument applies to the numbers and ft, for 

1 < i < 1 < fc < n; we see that for < > 0 and M sufficiently 

large, the inequality 

-O) 


I* — 


(» 


1 


(2A/) 


2A+« ’ 


] l,...,/i, fc“l,...,n. 


holds for every solution of (62). There is no loss in generality in sup- 




posing that M = ii-a)| = [fj, since (62) remains correct after replac¬ 
ing all quantities by their conjugates and if necessary interchanging 
X and y. Hence, for large M, 

On the other hand, there is a constant Ca, depending only on the co¬ 
efficients of V, such that 

Dl < C2M\ 

If we choose $\epsilon < n - 2h - v$, then for sufficiently large $M$,

$$|U(\eta, \zeta)| > |V(\eta, \zeta)|.$$

But a bound on $M$ implies a bound on the integral coefficients of the
polynomials defining $\eta$ and $\zeta$, so that there are only finitely many
solutions of (62).

Corollary. If $U(x, y)$ is a binary form of degree $n > 2$, with
coefficients in $Z$ and without repeated linear factors, and if $a \ne 0$ is a
rational integer, there are only finitely many rational integral solutions
of the equation $U(x, y) = a$. In particular, the equation

$$ax^n + by^n = c$$

has only finitely many solutions in $Z$ if $a$, $b$, and $c$ are in $Z$, $abc \ne 0$,
and $n \ge 3$.

This follows immediately from the theorem, with $K_0 = R$, $h = 1$,
and $n - 2h > 0$. The special case mentioned includes the higher-degree
analog of Pell's equation, $x^n - dy^n = N$.
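As an illustration of the corollary, one can search a finite box for solutions of a particular cubic equation of this type. The sketch below uses the form $x^3 - 2y^3$, which is chosen here only as an example and does not come from the text. A finite search cannot establish finiteness; it merely shows how sparse the solutions are.

```python
def thue_solutions(c, bound):
    """All integer pairs (x, y) with |x|, |y| <= bound and x^3 - 2*y^3 == c."""
    return [
        (x, y)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if x**3 - 2 * y**3 == c
    ]


if __name__ == "__main__":
    # Solutions of x^3 - 2y^3 = 1 in the box |x|, |y| <= 100.
    print(thue_solutions(1, 100))
```

In this box the search finds only the pairs $(-1, -1)$ and $(1, 0)$.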

In the above considerations, strong use was made of the homogeneity 
of U(x, y). If a Diophantine equation is not of the form specified in 
Theorem 4-16, it may still be possible to relate its solvability to that 
of one of this form. We now consider such a case. 

4-9 A special equation. It was conjectured by E. Catalan in 1842 
that 8 and 9 are the only two consecutive integers larger than 1 which 
are powers of other integers. This has never been proved; it has not 
even been shown that no three consecutive integers are powers, 
although it is trivial that no four can be, since one must be of the 





form $4k + 2$. In slightly different terms, the problem is to show that
the Diophantine equation 

-y‘ = I (64) 


has no solutions with $w$ and $z$ larger than 1, except for that mentioned.
Various special cases arise by fixing, or specializing in some
other way, one or more of the variables in (64). The case we are now
going to examine is that in which the exponents are fixed, so that we
consider the equation

$$x^m - y^n = 1. \tag{65}$$

Catalan's conjecture would be proved if it could be shown that for
each pair of integers $m$ and $n$ larger than 1, (65) has no positive solutions
except that mentioned. Since this seems to be unfeasible, we
consider the more modest question of whether (65) can have infinitely
many solutions. This, at last, is a question that can be answered.
It is a very weak consequence of the following theorem, due to
Mahler, that (65) has only finitely many solutions if $m \ge 2$, $n \ge 3$.
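The statement that 8 and 9 are the only consecutive perfect powers can at least be examined by direct search below a fixed limit. The following sketch does so; the limit and the function names are choices made here for illustration, and a finite search of this kind says nothing about larger integers.

```python
def perfect_powers(limit):
    """Set of integers n with 1 < n <= limit of the form a**b, a >= 2, b >= 2."""
    powers = set()
    a = 2
    while a * a <= limit:
        value = a * a
        while value <= limit:
            powers.add(value)
            value *= a
        a += 1
    return powers


def consecutive_power_pairs(limit):
    """Sorted pairs (n, n + 1) of perfect powers with n + 1 <= limit."""
    powers = perfect_powers(limit)
    return sorted((n, n + 1) for n in powers if n + 1 in powers)


if __name__ == "__main__":
    print(consecutive_power_pairs(10**6))
```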


Theorem 4-17. Suppose that $m \ge 2$, $n \ge 3$, $ab \ne 0$, $(x, y) = 1$.
Then as $\max(|x|, |y|) \to \infty$, the greatest prime factor of

$$ax^m + by^n$$

tends to infinity.

Since $x^2 - y^2 = 1$ has only the obvious solutions $x = \pm 1$, $y = 0$,
the new problem is completely solved. Unfortunately Mahler's
proof, which depends on a $p$-adic version of the Thue-Siegel theorem,
cannot be included here. We can, however, obtain partial results of
some interest.

If $mn$ is even, the fact that (65) has only finitely many solutions
is a consequence of the next theorem, which is a special case of a
theorem proved anonymously and published by L. J. Mordell.


Theorem 4-18. Let $f(x)$ be a polynomial of degree $n \ge 3$, with
coefficients in $Z$ and with distinct zeros, and let $a$ be any nonzero
rational integer. Then the equation

$$ay^2 = f(x) \tag{66}$$

has only finitely many solutions $x$, $y$ in $Z$.

Proof: Suppose that

$$f(x) = a_0(x - \xi_1) \cdots (x - \xi_n),$$

and that (66) has infinitely many solutions. The numbers $\alpha_j = a_0\xi_j$,
for $j = 1, \dots, n$, are algebraic integers, and if (66) holds, then

$$a_0^{n-1}ay^2 = (a_0x - \alpha_1) \cdots (a_0x - \alpha_n).$$

Let $K = R(\xi_1, \dots, \xi_n)$ be the splitting field of $f$. Any ideal in $K$
dividing $[a_0x - \alpha_i]$ and $[a_0x - \alpha_j]$ also divides $[\alpha_i - \alpha_j]$, so that the
norm of such a common divisor is a divisor of the discriminant $d$ of $f$.
Hence, if $P$ is a prime ideal divisor of $y$ and $NP > d$, then for some $i$,
$P \mid [a_0x - \alpha_i]$. Since there are only finitely many ideals with norms
smaller than $d$, and only finitely many divisors of $a_0^{n-1}a$, it follows
that for each $i$,

$$[a_0x - \alpha_i] = B_iC_i^2, \tag{67}$$

where $B_i$ and $C_i$ are ideals, and $B_i$ runs over a finite set of ideals.

Let D run over a fixed system of representatives of the various ideal 
classes in K; the number of D’s is finite. Then for each i and some 
D, Ci D, so that 

{0]Ci = m, 


for some /3 and 5. We shall show that /3 can be chosen from a finite set 
of integers of K. Let ([(3], [ 6 ]) = E, and put [)3] = EF, [5] = EG. 
Then EFCi = EDO, whence PC, = DG; thus F\D, and P is one of a 
finite set of ideals. By Theorem 3-2, there is an H with norm less 
than c (so that H is one of a finite set) such that PB = [ 7 ] is principal. 
Thus 


[y]Ci = {GH)D. 


Since C, Z), also [ 7 ] ~ GH; hence GH = [f,], and 

[y]Ci = 


where 7 is one of a finite set of integers. 

By (67), , , 

[y^][a(^ — a.] = ]D , 


from which it follows that B.B^ is principal, say B,Z)^ = [j?,]. Thus 
for some unit 

7,-^(ao^ — «i) = 

By Dirichlet’s theorem on units, e, can be written as where 

«/ is one of a finite number of units. Finally, for f = 1,. . . , n, 

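$$a_0x - \alpha_i = \chi_i\lambda_i^2,$$

where $\lambda_1, \dots, \lambda_n$ are integers of $K$, and $\chi_1, \dots, \chi_n$ are certain ones
of finitely many numbers of $K$. Hence

$$\chi_1\lambda_1^2 - \chi_2\lambda_2^2 = \alpha_2 - \alpha_1 \ne 0,$$

$$\chi_2\lambda_2^2 - \chi_3\lambda_3^2 = \alpha_3 - \alpha_2 \ne 0,$$

$$\chi_3\lambda_3^2 - \chi_1\lambda_1^2 = \alpha_1 - \alpha_3 \ne 0.$$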

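Now let $L = K(\sqrt{\chi_1}, \sqrt{\chi_2}, \sqrt{\chi_3})$. Then, in $L$,

$$(\lambda_1\sqrt{\chi_1} - \lambda_2\sqrt{\chi_2})(\lambda_1\sqrt{\chi_1} + \lambda_2\sqrt{\chi_2}) = \alpha_2 - \alpha_1,$$

and since the denominators of $\chi_1$ and $\chi_2$ can be taken to be bounded,
it follows that

$$\lambda_1\sqrt{\chi_1} - \lambda_2\sqrt{\chi_2} = \beta_3\epsilon_3^l,$$

where $\beta_3$ is one of finitely many elements of $L$, $\epsilon_3$ is a unit of $L$, and
$l > 1$ is an arbitrary positive integer. Similarly,

$$\lambda_2\sqrt{\chi_2} - \lambda_3\sqrt{\chi_3} = \beta_1\epsilon_1^l.$$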


But then 




If there were only finitely many distinct ratios $\epsilon_1/\epsilon_3$, there would be a
finite set of coefficients $\psi$ such that

$$\sqrt{a_0x - \alpha_2} - \sqrt{a_0x - \alpha_3} = \psi\left(\sqrt{a_0x - \alpha_1} - \sqrt{a_0x - \alpha_2}\right)$$

for every solution $x$ of (66) and for suitable determination of the
radicals. This is clearly impossible, so (68) must have infinitely
many solutions in integers $\epsilon_1/\epsilon_3$, $\epsilon_2/\epsilon_3$ of $L$. But for $l$ sufficiently
large, this is in contradiction with Theorem 4-16. Hence the supposition
that (66) has infinitely many solutions is not tenable, and the
proof is complete.

Returning to equation (65), we see that the only possible solutions
have $x = 0$ or $\pm 1$, if $(m, n) > 1$. For the problem that remains, it
suffices to consider the case in which $m = p$ and $n = q$ are distinct
odd primes. This was treated by M. Newman, whose work was not
published. A slightly strengthened version of his result, obtained by
applying Theorem 4-16 rather than the analogous consequence of the
Thue-Siegel theorem, follows.


Theorem 4-19. If $p$ and $q$ are distinct odd primes such that
$q > 2(p - 1)$ and $q$ does not divide the class number of the cyclotomic
field $K_p = R(\zeta)$, where $\zeta = \exp(2\pi i/p)$, then the equations

$$x^p - y^q = \pm 1 \tag{69}$$

have only finitely many solutions $x$, $y$ in $Z$.


Proof: We carry out the proof only for the equation $x^p - y^q = 1$;
the alternate case requires only trivial modifications. Put $1 - \zeta = \pi$
and $[\pi] = P$, so that $P$ is a prime ideal of $K_p$, by Theorem 3-6. Let $h$
be the class number of $K_p$.

If $x$ and $y$ satisfy (69) with the plus sign, then

$$[x - 1][x - \zeta] \cdots [x - \zeta^{p-1}] = [y]^q. \tag{70}$$

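Put

$$D_{rs} = [x - \zeta^r, x - \zeta^s], \quad \text{for } 0 \le r \le p - 1, \ 0 \le s \le p - 1, \ r \ne s.$$

Then

$$\begin{aligned}
D_{rs} &= [x - \zeta^r, \zeta^r - \zeta^s] = [x - 1 + 1 - \zeta^r, \zeta^r - \zeta^s] \\
&= \left[x - 1 + \frac{1 - \zeta^r}{1 - \zeta}\pi, \ \frac{\zeta^r - \zeta^s}{1 - \zeta}\pi\right] \\
&= [x - 1, \pi],
\end{aligned}$$

since $(\zeta^k - \zeta^l)/(1 - \zeta)$ is a unit if $p \nmid (k - l)$. Thus $D_{rs}$ is the same
for all $r$ and $s$, and, since $D_{rs} \mid P$ and $P$ is prime, either $D_{rs} = [1]$ or
$D_{rs} = P$. We consider the two cases separately.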

If $D_{rs} = [1]$, then the ideals $[x - \zeta^r]$ are pairwise relatively prime;
since their product is a $q$th power, there are ideals $A_0, \dots, A_{p-1}$
such that

$$[x - \zeta^r] = A_r^q, \quad r = 0, \dots, p - 1. \tag{71}$$

Suppose that $e_r$ is the smallest positive integer such that $A_r^{e_r}$ is principal;
by Theorem 3-4, $e_r \mid h$, and by (71), $e_r \mid q$. But $q$ is prime and $q \nmid h$,
so $e_r = 1$ and $A_r$ is principal. Hence there are integers $\alpha$ and $\beta$ and
units $\epsilon$ and $\epsilon'$ of $K_p$ such that $x - 1 = \epsilon\alpha^q$ and $x - \zeta = \epsilon'\beta^q$, whence

$$\epsilon\alpha^q - \epsilon'\beta^q = -\pi. \tag{72}$$

By Theorem 2-45, the units of $K_p$ have a finite basis, so that each
unit has a representation $\epsilon_1\epsilon_2^q$, where $\epsilon_1$ is one of the finite number
of units obtained by taking products of powers of the basis elements,
with exponents non-negative and smaller than $q$. Thus (72) implies
that one of the finitely many equations

$$\epsilon_1(\epsilon_2\alpha)^q - \epsilon_1'(\epsilon_2'\beta)^q = -\pi \tag{73}$$

must hold. But for each choice of $\epsilon_1$ and $\epsilon_1'$, (73) has only finitely
many integral solutions $\epsilon_2\alpha$, $\epsilon_2'\beta$ in $K_p$; this is evident from Theorem
4-16 with $K_0 = K_p$, $h = p - 1$, $n = q > 2(p - 1)$, $v = 0$. Hence
$x$, and therefore also $y$, has only finitely many possible values.

The proof for the case Dn = P proceeds similarly. We put 
X — \ = mv and y = ir”*2, w’here w and 2 are integers of A'p \vith 
[t, z] = [1]. Then (70) becomes 






and since the ideals on the left are pairwise relatively prime, there is 
a t with 0 < f < p — 1 such that 



Thus there are ideals Aq, . . . , i4p_i such that 

[uj + = P^^-^Al 


for 0 < r < p — 1, r 9 ^ t. 


As before, it follows that all the ideals $A_r$ are principal (for $r = t$, use
the fact that an ideal equivalent to a principal ideal is principal).
Since $p > 2$, there are distinct rational integers $r$ and $s$ different from
$t$ such that $0 \le r \le p - 1$, $0 \le s \le p - 1$. Then for integers $\alpha$ and
$\beta$ and units $\epsilon$ and $\epsilon'$ of $K_p$,






w + = tV, 



so that


JM <1 f f 

and the expression on the right is not zero. The earlier reasoning 
shows that the theorem is also true in this case. 

PROBLEMS 

1. Extend Theorem 4-18 to the case that $f$ may have multiple zeros, but
has at least three distinct zeros of odd orders.

2. Deduce from the finiteness of the number of solutions of (66) that as
the integral variable $x$ tends to infinity, the greatest prime divisor of $f(x)$
does also. [Hint: Assume that for infinitely many $x$, $f(x)$ is a product of
powers of a fixed finite set of primes, and obtain a contradiction.]

REFERENCES 

Section 4-1 

See the following papers: J. Liouville, Journal de Mathématiques Pures
et Appliquées (Paris) 16, 133-142 (1851); A. Thue, Journal für die reine
und angewandte Mathematik (Berlin) 136, 284-305 (1909); C. L. Siegel,
Mathematische Zeitschrift (Berlin) 10, 173-213 (1921); F. J. Dyson, Acta
Mathematica (Stockholm) 79, 225-240 (1947); T. Schneider, Archiv der
Mathematik (Karlsruhe) 1, 288-295 (1948-1949); K. F. Roth, Mathematika
(London) 2, 1-20 (1955); Corrigendum, Mathematika 2, 168 (1955).

The paper by Siegel contains many variants and applications of the
Thue-Siegel theorem.

Section 4-7

The literature concerning Catalan's conjecture is reviewed by R. Obláth,
Revista Matemática Hispano-Americana (Madrid) 1, 122-140 (1941).
Mahler's theorem appeared in Nieuw Archief voor Wiskunde (Amsterdam)
1, 113-122 (1953). Theorem 4-17 appeared in Journal of the London
Mathematical Society 2, 66-68 (1926).



CHAPTER 5 


IRRATIONALITY AND TRANSCENDENCE 

5-1 Irrational numbers. One of the oldest results in the theory
of numbers is that $\sqrt{2}$ is irrational; this was known to the Pythagoreans
in the fifth century B.C. The proof, when suitably generalized
with the help of the Unique Factorization Theorem, leads to the well-known
rule for determining the possible rational zeros of a polynomial
with rational integral coefficients; this in turn makes it possible to
show, if such is the case, that a given polynomial has only irrational
zeros. Thus the numbers given implicitly as zeros of polynomials can
be trivially classified as rational or irrational.

If a number is given by its decimal expansion, one has only to
determine whether its digits eventually recur periodically to know
whether or not it is irrational. For example, the number

0.1234567891011 . . . ,

whose successive digits are formed in an obvious fashion, is clearly
irrational, since arbitrarily long blocks of a single digit occur, precluding
periodicity. Similarly, using the regular continued fraction
expansion of a real number, one can identify not only the rational
numbers but also the quadratic irrationalities. (Unfortunately,
there is no simple algorithm known which singles out the algebraic
numbers of fixed degree $n \ge 3$ in a distinctive way.)

If a real number $x$ is not given in one of these convenient forms, the
problem of deciding whether or not it is rational may be decidedly
nontrivial. It is, for example, not known whether Euler's constant,
defined as

$$\gamma = \lim_{n \to \infty} \left(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n\right),$$

is rational. Aside from properties of special algorithms, the only
method available for investigating such questions depends on the
following observation. If $x = a/b$ is rational, then for every pair of
integers $p$ and $q$, the number $qx - p$ is some integral multiple of $1/b$,
so that it is impossible to find an infinite sequence of pairs $p_n$ and $q_n$
such that

$$|q_1x - p_1| > |q_2x - p_2| > |q_3x - p_3| > \cdots. \tag{1}$$

More generally, no such sequence can be found for which

$$|q_nx - p_n| \ne 0 \ \text{for every } n, \quad \text{and} \quad \lim_{n \to \infty} |q_nx - p_n| = 0. \tag{2}$$




On the other hand, when $x$ is irrational there are infinitely many solutions
of the inequality

$$0 < |qx - p| < \frac{1}{q}.$$

We therefore have

Theorem 5-1. Each of the following is a necessary and sufficient
condition for the irrationality of a real number $x$:

(a) there are integers $p_1, q_1, p_2, q_2, \dots$ such that the inequalities (1)
hold;

(b) there are integers $p_1, q_1, p_2, q_2, \dots$ such that the conditions (2)
hold.

As a simple application of this principle, we prove

Theorem 5-2. The number $e$ is irrational.

Proof: We recall the expansion

$$\frac{1}{e} = 1 - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^n}{n!} + \cdots.$$

It is well known that if $a_0, a_1, \dots$ is an unbounded increasing sequence
of positive numbers, then the series

$$\sum_{k=0}^{\infty} \frac{(-1)^k}{a_k} \tag{3}$$

converges to its sum $S$ in such a way that

$$0 < \left|S - \sum_{k=0}^{n} \frac{(-1)^k}{a_k}\right| < \frac{1}{a_{n+1}}$$

for $n \ge 0$. Hence if we put $q_n = n!$ and

$$p_n = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!},$$

then $p_n$ and $q_n$ are integers and

$$0 < \left|\frac{q_n}{e} - p_n\right| = n! \left|\frac{1}{e} - \sum_{k=0}^{n} \frac{(-1)^k}{k!}\right| < \frac{n!}{(n + 1)!} = \frac{1}{n + 1}.$$

It follows that $1/e$, and hence $e$ itself, is irrational. (This is a variant
of the original proof due to Fourier.) More generally, the same
argument shows that if the lcm of the integers $a_1, \dots, a_n$ is $o(a_{n+1})$
as $n \to \infty$, then the series (3) converges to an irrational number.
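The inequality $0 < |q_n/e - p_n| < 1/(n + 1)$ used in this proof can be observed numerically. The sketch below works in exact rational arithmetic and, as an assumption of the sketch, replaces $1/e$ by a much longer partial sum of the same alternating series, so it illustrates the estimate rather than proving anything about $e$ itself. All names are illustrative.

```python
from fractions import Fraction
from math import factorial


def alternating_partial_sum(n):
    """Exact value of sum_{k=0}^{n} (-1)^k / k!."""
    return sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))


# Stand-in for 1/e: a partial sum with many more terms (assumption of this sketch).
INV_E_PROXY = alternating_partial_sum(60)


def gap(n):
    """|q_n * (1/e) - p_n| with q_n = n! and p_n = n! * S_n, using the proxy for 1/e."""
    q_n = factorial(n)
    p_n = q_n * alternating_partial_sum(n)
    assert p_n.denominator == 1  # p_n is an integer, as the proof requires
    return abs(q_n * INV_E_PROXY - p_n)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, float(gap(n)), 1 / (n + 1))
```

The printed gaps are positive and shrink roughly like $1/(n + 1)$, which is the behavior condition (2) of Theorem 5-1 asks for.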

For completeness, we give a proof due to I. Niven that $\pi$ is irrational.
It is short and simple to follow, but to one unfamiliar with
older work it must appear completely unmotivated.

Theorem 5-3. The number $\pi$ is irrational.

Proof: Suppose on the contrary that $\pi = a/b$, where $a$ and $b$ are
integers. Put

$$f(x) = \frac{x^n(a - bx)^n}{n!}$$

and

$$F(x) = f(x) - f''(x) + f^{(4)}(x) - \cdots + (-1)^n f^{(2n)}(x),$$

where the positive integer $n$ will be specified later. Now $f(0) = f'(0) = \cdots = f^{(n-1)}(0) = 0$, and if we write

$$f(x) = \frac{a_0x^n + a_1x^{n+1} + \cdots + a_nx^{2n}}{n!},$$

we see that for $n \le k \le 2n$,

$$f^{(k)}(x) = \frac{1}{n!} \sum_{i=k-n}^{n} (n + i)(n + i - 1) \cdots (n + i - k + 1)a_ix^{n+i-k} = \frac{1}{n!} \sum_{i=k-n}^{n} \frac{(n + i)!}{(n + i - k)!} a_ix^{n+i-k},$$

so that

$$f^{(k)}(0) = \frac{k!}{n!} a_{k-n}.$$

Hence $f^{(j)}(0) \in Z$, and since $f(x) = f(\pi - x)$, also $f^{(j)}(\pi) \in Z$, for
$0 \le j \le 2n$. Finally, $F(0)$ and $F(\pi)$ must be integers.
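The integrality of the derivatives of $f$ at 0 and at $a/b$ can be checked mechanically for a sample choice of $a$, $b$, $n$. In the sketch below the values $a = 22$, $b = 7$ are arbitrary illustrative integers (the proof assumes $\pi = a/b$, which of course holds for no integers), and the helper names are not from the text.

```python
from fractions import Fraction
from math import comb, factorial


def niven_coefficients(a, b, n):
    """Coefficients c[k] of x^k in f(x) = x^n (a - b x)^n / n!, as exact fractions."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for i in range(n + 1):
        # (a - b x)^n contributes binom(n, i) * a^(n-i) * (-b)^i to x^i.
        coeffs[n + i] = Fraction(comb(n, i) * a ** (n - i) * (-b) ** i, factorial(n))
    return coeffs


def derivative_at(coeffs, j, x0):
    """Value at x0 of the j-th derivative of the polynomial with these coefficients."""
    total = Fraction(0)
    for k in range(j, len(coeffs)):
        falling = factorial(k) // factorial(k - j)  # k (k-1) ... (k-j+1)
        total += coeffs[k] * falling * Fraction(x0) ** (k - j)
    return total


if __name__ == "__main__":
    a, b, n = 22, 7, 4
    c = niven_coefficients(a, b, n)
    for j in range(2 * n + 1):
        print(j, derivative_at(c, j, 0), derivative_at(c, j, Fraction(a, b)))
```

Every printed value is an integer, and the value at $a/b$ equals $(-1)^j$ times the value at 0, reflecting the symmetry $f(x) = f(a/b - x)$.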





On the other hand,

$$\frac{d}{dx}\{F'(x) \sin x - F(x) \cos x\} = F''(x) \sin x + F(x) \sin x = f(x) \sin x,$$

so that

$$\int_0^{\pi} f(x) \sin x \, dx = [F'(x) \sin x - F(x) \cos x]_0^{\pi} = F(\pi) + F(0).$$

But for $0 < x < \pi$,

$$0 < f(x) \sin x < \frac{\pi^n a^n}{n!},$$

so that the above integral is positive but arbitrarily small for $n$
sufficiently large. But this is impossible, since $F(0) + F(\pi)$ is an
integer. The contradiction establishes the theorem.


PROBLEM 

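Given a real number $x$, define the sequence $\{x_k\}$ of real numbers and the
sequence $\{a_k\}$ of integers by the conditions

$$[x] = a_0, \qquad x_1 = x - [x],$$

$$x_1 = \frac{1}{a_1} + x_2, \quad \text{where } \frac{1}{a_1} \le x_1 < \frac{1}{a_1 - 1},$$

$$x_2 = \frac{1}{a_2} + x_3, \quad \text{where } \frac{1}{a_2} \le x_2 < \frac{1}{a_2 - 1},$$

$$\cdots$$

$$x_k = \frac{1}{a_k} + x_{k+1}, \quad \text{where } \frac{1}{a_k} \le x_k < \frac{1}{a_k - 1},$$

$$\cdots$$

Thus

$$x = a_0 + \frac{1}{a_1} + \frac{1}{a_2} + \cdots.$$

Show that this expansion terminates if and only if $x$ is rational. Show also
that if $x$ has an infinite series expansion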


x^ho + r + r"*-• 

bl 02 

where the numbers bt are integers with > h**, then bk 
and X is irrational. 


o* for all k, 




5-2 The existence of transcendental numbers. One class of
irrationals, the algebraic numbers, has been treated in some detail in
the preceding chapters. We now consider the complementary set of
transcendental numbers: those complex numbers which do not satisfy
any rational algebraic equation with coefficients in $Z$. It is by no
means obvious that this set is nonvacuous; the first proof, given by
Liouville in 1844, depends on the fact (see Theorem 4-1) that if $\alpha$
is algebraic of degree $n \ge 2$, then there is a constant $C$ such that
the inequality

$$\left|\alpha - \frac{p}{q}\right| < \frac{C}{q^n}$$

has no solution $p$, $q$ in $Z$. If a number $\xi$ can be found such that for
every $\omega > 0$ the inequality

$$0 < |q\xi - p| < \frac{1}{q^{\omega}}, \quad q > 1, \tag{4}$$

has a solution, then $\xi$ cannot be algebraic of any degree, and must
therefore be transcendental.

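An example of a Liouville number, for which (4) always has a
solution, is given by

$$\xi = \sum_{k=1}^{\infty} (-1)^k a^{-b_k},$$

where $a > 1$ is a fixed integer and $b_1, b_2, \dots$ is an increasing sequence
of positive integers such that

$$\limsup_{k \to \infty} \frac{b_{k+1}}{b_k} = \infty.$$

For, given $\omega$, there is an $n = n(\omega)$ for which $b_{n+1}/b_n > \omega + 1$, and if
we put

$$q = a^{b_n}, \qquad p = a^{b_n} \sum_{k=1}^{n} (-1)^k a^{-b_k},$$

then $p$ and $q$ are integers, and

$$0 < |q\xi - p| < q \cdot \frac{1}{a^{b_{n+1}}} = \frac{1}{a^{b_{n+1} - b_n}} < \frac{1}{a^{\omega b_n}} = \frac{1}{q^{\omega}}.$$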
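The estimate just obtained can be tried out on a concrete instance. The sketch below takes $a = 2$ and $b_k = k!$, which are choices made here for illustration (then $b_{n+1}/b_n = n + 1$, so any $\omega < n$ is admissible), and, as a further assumption, replaces $\xi$ by a truncation of its series so that everything stays in exact rational arithmetic.

```python
from fractions import Fraction
from math import factorial


def partial_sum(a, n):
    """Exact value of sum_{k=1}^{n} (-1)^k * a^(-k!)."""
    return sum(Fraction((-1) ** k, a ** factorial(k)) for k in range(1, n + 1))


def liouville_gap(a, n, terms=7):
    """Return (q, p, |q*xi - p|) with q = a^(n!), p = q * (n-th partial sum),
    where xi is replaced by its truncation after `terms` terms."""
    xi = partial_sum(a, terms)  # truncated stand-in for xi
    q = a ** factorial(n)
    p = q * partial_sum(a, n)
    return q, p, abs(q * xi - p)


if __name__ == "__main__":
    for n in range(2, 5):
        q, p, g = liouville_gap(2, n)
        # Compare with q^(-omega) for omega = n - 1.
        print(n, p.denominator == 1, 0 < g < Fraction(1, q ** (n - 1)))
```

For each $n$ shown, $p$ is an integer and the gap lies strictly between 0 and $q^{-(n-1)}$, in line with inequality (4) for $\omega = n - 1$.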


It should be emphasized that the condition (4), while sufficient
for transcendence, is by no means necessary, even for real numbers.
For, using a modification of an argument due to Cantor, we can give
a second proof of the existence of transcendental numbers, and in
particular of numbers of this kind for which the inequality

$$\left|\xi - \frac{p}{q}\right| < \frac{1}{(3 + \epsilon)q^2}$$

has only finitely many solutions for fixed $\epsilon > 0$. It is known* that
there are uncountably many irrational numbers $\xi$ for which $M(\xi) = 3$,
where $M(\xi)$ is the supremum of the numbers $\lambda$ for which the inequality

$$\left|\xi - \frac{p}{q}\right| < \frac{1}{\lambda q^2}$$

has infinitely many solutions. Hence if the algebraic numbers are
countable, it follows that there are nonalgebraic numbers for which
$M(\xi) = 3$.

To order the algebraic numbers, we associate with each nonconstant polynomial $P(x) = a_0x^n + \cdots + a_n$ with integral coefficients the number $h(P) = n + |a_0| + \cdots + |a_n|$. There are no polynomials with $h(P) = 1$. If $h(P) = 2$, then $P(x) = x$ or $-x$. If $h(P) = 3$, then $P(x)$ is one of $\pm x^2$, $\pm x \pm 1$, $\pm 2x$, all combinations of signs being allowed. In general, it is clear that if $k \ge 2$, there are only finitely many polynomials such that $h(P) = k$. Hence all polynomials with integral coefficients can be arranged in a sequence: first those with $h(P) = 2$, in some order, then those with $h(P) = 3$, in some order, etc. Suppose that $P_1(x), P_2(x), \ldots$ is such a sequence. Each $P_k(x)$ has finitely many zeros; write down all the zeros of $P_1(x)$ in some order, then all those of $P_2(x)$ in some order, etc. Let this sequence be $\theta_1, \theta_2, \ldots$. Now if $\theta_2 = \theta_1$, delete $\theta_2$; if $\theta_3 = \theta_2$ or $\theta_1$, delete $\theta_3$; and in general, if $\theta_k$ is equal to some $\theta$ with smaller subscript, delete $\theta_k$. Then the resulting sequence $\alpha_1, \alpha_2, \ldots$ contains all algebraic numbers, each just once.
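The finiteness claim behind this ordering is easy to make concrete. The sketch below lists, for a given $k$, every nonconstant integral polynomial with $h(P) = k$ as a coefficient tuple $(a_0, \ldots, a_n)$ with leading coefficient $a_0 \ne 0$; the function name is hypothetical and the brute-force search is chosen only for clarity.

```python
from itertools import product

def polynomials_of_height(k):
    """Yield tuples (a_0, ..., a_n), n >= 1, a_0 != 0, with
    n + |a_0| + ... + |a_n| == k, where P(x) = a_0 x^n + ... + a_n."""
    for n in range(1, k):
        budget = k - n  # what is left for |a_0| + ... + |a_n|
        for coeffs in product(range(-budget, budget + 1), repeat=n + 1):
            if coeffs[0] != 0 and sum(abs(c) for c in coeffs) == budget:
                yield coeffs

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, sorted(polynomials_of_height(k)))
```

For $k = 3$ this produces the eight polynomials $\pm x^2$, $\pm x \pm 1$, $\pm 2x$; concatenating the lists for $k = 2, 3, 4, \ldots$ gives one admissible sequence $P_1, P_2, \ldots$.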

To summarize, if a number can be approximated sufficiently well 
by rational numbers, it is transcendental, but there are transcendental 
numbers which cannot be approximated even as well as some quad¬ 
ratic irrationalities. 

* See, for example, Volume I, Theorem 9-12.




PROBLEMS 

1. Show that $\xi$ is a Liouville number if the partial quotients in its continued fraction expansion,

$$\xi = a_0 + \cfrac{1}{a_1 + \cdots},$$

have the property that

$$\limsup_{k\to\infty} \frac{\log a_{k+1}}{\log\,((a_1 + 1)\cdots(a_k + 1))} = \infty.$$

[Hint: Show from the recursion relation for the successive convergents that $q_k < (a_1 + 1)\cdots(a_k + 1)$, and then use Theorem 2-6.]

2. Investigate the implications of Theorem 4-15 as regards transcendental numbers.

5-3 A criterion for transcendence. In order to obtain an approximability condition which is equivalent to transcendence, we must replace the linear expression $q\xi - p$ occurring in the inequality (4) by a polynomial in $\xi$.

Theorem 5-4. A real or complex number $\xi$ is transcendental if and only if there corresponds to each $\omega > 0$ a positive integer n, such that the inequality

$$0 < |x_0 + x_1\xi + \cdots + x_n\xi^n| < X^{-\omega} \tag{5}$$

has infinitely many integral solutions $x_0, \ldots, x_n$, where

$$X = \max(|x_0|, \ldots, |x_n|).$$

It is to be noticed that the Liouville numbers (those for which (4) has a solution for each $\omega$) are precisely the numbers for which we can take n = 1 for every $\omega$. In general, however, n increases with $\omega$.

Proof: We first prove that the condition is sufficient. Let $\alpha = \alpha_1$ be algebraic of degree $g$, let $f(x) = a_0 + a_1x + \cdots + a_gx^g$ be that multiple of its defining polynomial which has relatively prime coefficients in Z, with $a_g > 0$, and let $\alpha_1, \ldots, \alpha_g$ be its conjugates. Let $h(x) = x_0 + x_1x + \cdots + x_nx^n$ ($x_n > 0$) be any polynomial with integral coefficients, and with zeros $\beta_1, \ldots, \beta_n$ distinct from $\alpha_1, \ldots, \alpha_g$. Then

$$\frac{1}{a_g^{\,n}}\prod_{i=1}^{n} f(\beta_i) = \prod_{i=1}^{n}\prod_{j=1}^{g}(\beta_i - \alpha_j) = (-1)^{ng}\prod_{j=1}^{g}\prod_{i=1}^{n}(\alpha_j - \beta_i) = \frac{(-1)^{ng}}{x_n^{\,g}}\prod_{j=1}^{g} h(\alpha_j),$$




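so that if $X = \max(|x_0|, \ldots, |x_n|)$, then

$$0 < |h(\alpha)| = \frac{x_n^{\,g}\left|\prod_{i=1}^{n} f(\beta_i)\right|}{a_g^{\,n}\prod_{j=2}^{g}|h(\alpha_j)|} = \frac{x_n^{\,g}\left|\prod_{i=1}^{n} f(\beta_i)\right|}{a_g^{\,n}X^{g-1}\prod_{j=2}^{g}\left|\dfrac{h(\alpha_j)}{X}\right|}. \tag{6}$$

But

$$\prod_{i=1}^{n} f(\beta_i)$$

is a symmetric polynomial with integral coefficients in the $\beta$'s, of degree g in each $\beta$, and is therefore, by the Symmetric Function Theorem, a polynomial of total degree g, with integral coefficients, in the elementary symmetric functions $-x_{n-1}/x_n$, $x_{n-2}/x_n$, ..., $\pm x_0/x_n$. Hence the numerator in the expression (6) is a positive integer, and we have

$$|h(\alpha)| \ge \frac{1}{a_g^{\,n}X^{g-1}\prod_{j=2}^{g}\left|\dfrac{h(\alpha_j)}{X}\right|}.$$

Now if $r = \max(|\alpha_1|, \ldots, |\alpha_g|)$, then

$$\left|\frac{h(\alpha_j)}{X}\right| \le 1 + |\alpha_j| + |\alpha_j|^2 + \cdots + |\alpha_j|^n \le 1 + r + r^2 + \cdots + r^n,$$

so that the quantity

$$\frac{1}{a_g^{\,n}\prod_{j=2}^{g}\left|\dfrac{h(\alpha_j)}{X}\right|}$$

has a positive lower bound $\lambda(n, \alpha)$ depending only on $\alpha$ and n. Thus

$$|h(\alpha)| \ge \frac{\lambda(n, \alpha)}{X^{g-1}}. \tag{7}$$

It follows that if (5) has infinitely many solutions with fixed n, $\xi$ cannot be algebraic of degree less than $\omega + 1$. Since $\omega$ can be arbitrarily large, $\xi$ cannot be algebraic.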
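The lower bound (7) can be observed numerically in the simplest case. The sketch below takes $\alpha = \sqrt{2}$ (so $g = 2$, $f(x) = x^2 - 2$) and $n = 1$; following the proof, $|h(\alpha)|\,|h(-\alpha)| = |x_0^2 - 2x_1^2| \ge 1$ and $|h(-\alpha)| \le (1 + \sqrt{2})X$, so one admissible value is $\lambda(1, \sqrt{2}) = 1/(1 + \sqrt{2})$. This value, the function name, and the search bound are illustrative assumptions, and floating-point arithmetic is used only for a small range.

```python
import math

def min_scaled_value(bound):
    """Minimum of X * |x0 + x1*sqrt(2)| over integer pairs (x0, x1), not both
    zero, with max(|x0|, |x1|) = X <= bound."""
    alpha = math.sqrt(2)
    smallest = math.inf
    for x0 in range(-bound, bound + 1):
        for x1 in range(-bound, bound + 1):
            X = max(abs(x0), abs(x1))
            if X == 0:
                continue  # the zero polynomial is excluded
            smallest = min(smallest, X * abs(x0 + x1 * alpha))
    return smallest

if __name__ == "__main__":
    lam = 1 / (1 + math.sqrt(2))
    print(min_scaled_value(40), lam)
```

The minimum is attained at $h(x) = 1 - x$, where $X\,|h(\sqrt{2})| = \sqrt{2} - 1$ coincides with the bound, so in this case the constant cannot be improved.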

The necessity of the condition of Theorem 5-4 is a consequence 

of the following more general theorem. 




Theorem 5-5. If $\vartheta_1, \ldots, \vartheta_n$ are complex numbers, then for a suitable c which depends only on n and $\vartheta_1, \ldots, \vartheta_n$, the inequality

$$|x_0 + x_1\vartheta_1 + \cdots + x_n\vartheta_n| < cX^{-(n-1)/2} \tag{8}$$

has infinitely many integral solutions $x_0, \ldots, x_n$.

If $\xi$ is transcendental, we can take $\vartheta_k = \xi^k$; since no polynomial in $\xi$ vanishes, it follows that (5) has infinitely many solutions if $n = [2\omega + 2]$.

Proof: The theorem is trivial if n = 1. For n > 1, put

$$c' = c'(\vartheta_1, \ldots, \vartheta_n) = 1 + |\vartheta_1| + \cdots + |\vartheta_n|,$$

let $h \ge 2$ be a positive integer, and let $x_0', x_1', \ldots, x_n'$ range independently over the integers from $-h$ to $h$ inclusive. Since each of the n + 1 numbers $x_k'$ can assume any of $2h + 1$ values, there are $(2h + 1)^{n+1} = t$ expressions

$$L(\vartheta_1, \ldots, \vartheta_n) = x_0' + x_1'\vartheta_1 + \cdots + x_n'\vartheta_n, \qquad |x_k'| \le h.$$

Let these be, in some order, $L_1, \ldots, L_t$. Clearly

$$|L_i(\vartheta_1, \ldots, \vartheta_n)| \le c'h,$$

so that all the points $L_i(\vartheta_1, \ldots, \vartheta_n)$ lie in the square of side $2c'h$ with its center at the origin of the complex plane. Subdivide this square into $m^2$ subsquares of side $2c'h/m$ each; then if $m^2 < t$, there must be at least one subsquare containing more than one point $L(\vartheta_1, \ldots, \vartheta_n)$. We can fulfill the condition $m^2 < t$ by taking

$$m = [(2h + 1)^{(n+1)/2}] - 1.$$

For this m, suppose that the points

$$L_1(\vartheta_1, \ldots, \vartheta_n) = x_0' + x_1'\vartheta_1 + \cdots + x_n'\vartheta_n$$

and

$$L_2(\vartheta_1, \ldots, \vartheta_n) = x_0'' + x_1''\vartheta_1 + \cdots + x_n''\vartheta_n$$

lie in a common subsquare; the distance between them does not exceed the length of the diagonal of the subsquare, which is $2\sqrt{2}\,c'h/m$. If we put $x_k = x_k' - x_k''$ (so that $X \le h - (-h) = 2h$), and

$$L_1(\vartheta_1, \ldots, \vartheta_n) - L_2(\vartheta_1, \ldots, \vartheta_n) = x_0 + x_1\vartheta_1 + \cdots + x_n\vartheta_n,$$

then

$$|x_0 + x_1\vartheta_1 + \cdots + x_n\vartheta_n| \le \frac{2\sqrt{2}\,c'h}{[(2h + 1)^{(n+1)/2}] - 1} < \frac{4c'h}{(2h)^{(n+1)/2}} = \frac{2c'}{(2h)^{(n-1)/2}} \le \frac{2c'}{X^{(n-1)/2}}. \tag{9}$$

Hence (8) has at least one solution, with $c = 2c'$.

If $L(\vartheta_1, \ldots, \vartheta_n) = 0$ for the form just found, then $xL(\vartheta_1, \ldots, \vartheta_n) = 0$ for every integer x, and (8) has infinitely many solutions. In the contrary case, choose $h_1$ so large that

$$\frac{2c'}{(2h_1)^{(n-1)/2}} < |L(\vartheta_1, \ldots, \vartheta_n)|,$$

and repeat the entire argument with h replaced by $h_1$. Calling the new form thus produced $L^{(1)}$, we have, by the analog of (9) and the definition of $h_1$, that

$$|L^{(1)}(\vartheta_1, \ldots, \vartheta_n)| < |L(\vartheta_1, \ldots, \vartheta_n)|,$$

so that we have a second solution of (8). Continuing the process, we can obtain arbitrarily many solutions.
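The pigeonhole step of this proof is constructive, and a small search makes it tangible. The following sketch assumes nothing beyond the proof itself: it places the $t = (2h+1)^{n+1}$ points into $m^2$ subsquares and returns the difference of the first two coefficient vectors that share a subsquare. The function name and the sample values of $\vartheta_k$ are hypothetical.

```python
import cmath
import math
from itertools import product

def pigeonhole_solution(thetas, h):
    """Return (x, m, c1): a nonzero integer vector x = (x_0, ..., x_n) with
    max |x_k| <= 2h whose form value has modulus at most 2*sqrt(2)*c1*h/m,
    where c1 plays the role of c' in the text."""
    n = len(thetas)
    c1 = 1 + sum(abs(t) for t in thetas)
    t = (2 * h + 1) ** (n + 1)       # number of points
    m = math.isqrt(t) - 1            # guarantees m*m < t
    side = 2 * c1 * h / m            # side of each subsquare
    seen = {}
    for xs in product(range(-h, h + 1), repeat=n + 1):
        value = complex(xs[0] + sum(x * th for x, th in zip(xs[1:], thetas)))
        ix = min(int((value.real + c1 * h) / side), m - 1)
        iy = min(int((value.imag + c1 * h) / side), m - 1)
        if (ix, iy) in seen:
            other = seen[(ix, iy)]
            return tuple(a - b for a, b in zip(xs, other)), m, c1
        seen[(ix, iy)] = xs
    raise AssertionError("unreachable: there are more points than subsquares")

if __name__ == "__main__":
    thetas = [cmath.exp(1j), math.sqrt(2)]   # illustrative values only
    x, m, c1 = pigeonhole_solution(thetas, h=3)
    value = x[0] + sum(c * th for c, th in zip(x[1:], thetas))
    print(x, abs(value), 2 * c1 / (2 * 3) ** ((len(thetas) - 1) / 2))
```

Repeating the call with larger `h` mirrors the last paragraph of the proof, where $h_1$ is chosen so that the new bound falls below the value already found.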


PROBLEM 

Show that if the numbers $\vartheta_1, \ldots, \vartheta_n$ are real, then Theorem 5-5 remains correct if the inequality (8) is replaced by

$$|x_0 + x_1\vartheta_1 + \cdots + x_n\vartheta_n| < cX^{-n}.$$

5-4 Measure of transcendence. Mahler's classification. In light of Theorem 5-4, we make the following definition: a function $\varphi(n, X)$ is called a transcendence measure for the transcendental number $\xi$ if for each n there is a constant $c_n$ such that for every $X \ge 1$,

$$|x_0 + x_1\xi + \cdots + x_n\xi^n| > c_n\varphi(n, X)$$

for each set of integers $x_0, \ldots, x_n$ of height $X = \max(|x_0|, \ldots, |x_n|)$. By Theorem 5-5, any such $\varphi(n, X)$ is no larger than a constant multiple of $X^{-(n-1)/2}$. A theorem giving a measure of transcendence of a number $\xi$ represents a refinement of the assertion that $\xi$ is transcendental; such measures have been given for certain numbers. In Section 5-5 we shall determine a measure of transcendence for e.

Mahler has elaborated on the theory of transcendence measure in the following way. Let z be a complex number, and put

$$\omega_n(X) = \omega_n(X, z) = \min\left|\sum_{k=0}^{n} x_kz^k\right|, \tag{10}$$

where the minimum is extended over all those sets of rational integral coefficients $x_0, \ldots, x_n$ of heights at most X for which

$$\sum_{k=0}^{n} x_kz^k \ne 0.$$

Then $\omega_n(X)$ is at most 1, and is a nonincreasing function of both X and n. Put

$$\omega_n(X) = X^{-\rho_n(X)},$$

so that

$$\rho_n(X) = \frac{\log\,(1/\omega_n(X))}{\log X},$$

and let

$$\omega_n(z) = \omega_n = \limsup_{X\to\infty}\rho_n(X), \qquad \omega(z) = \omega = \limsup_{n\to\infty}\frac{\omega_n}{n}. \tag{11}$$

Each of $\omega_n$ and $\omega$ is either $+\infty$ or a non-negative number. If $\omega_n$ is infinite and $n' > n$, then $\omega_{n'}$ is also infinite; hence there is an index $\mu(z) = \mu$, which may be finite or infinite, such that $\omega_n$ is finite for $n < \mu$ and infinite for $n \ge \mu$. The two quantities $\mu$, $\omega$ are never finite simultaneously, for the finiteness of $\mu$ implies that there is an $n < \infty$ such that $\omega_n = \infty$, whence $\omega = \infty$. The number z is called

an A-number, if $\omega = 0$, $\mu = \infty$,
an S-number, if $0 < \omega < \infty$, $\mu = \infty$,
a T-number, if $\omega = \infty$, $\mu = \infty$,
a U-number, if $\omega = \infty$, $\mu < \infty$.
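For small n and X the minimum in definition (10) can be found by exhaustive search, which helps in reading the definitions of $\omega_n(X)$ and $\rho_n(X)$. The sketch below is illustrative only: the function names are hypothetical, the sample value z = e is an assumption, and the test `value != 0` in floating point is adequate only because no nonzero integral polynomial vanishes at a transcendental z.

```python
import math
from itertools import product

def omega(n, X, z):
    """Brute-force omega_n(X, z): the least nonzero |x_0 + x_1 z + ... + x_n z^n|
    over integer coefficients with |x_k| <= X."""
    best = math.inf
    for coeffs in product(range(-X, X + 1), repeat=n + 1):
        value = abs(sum(x * z ** k for k, x in enumerate(coeffs)))
        if value != 0:
            best = min(best, value)
    return best

def rho(n, X, z):
    """rho_n(X) = log(1/omega_n(X)) / log X, defined here for X >= 2."""
    return math.log(1 / omega(n, X, z)) / math.log(X)

if __name__ == "__main__":
    for n, X in [(1, 1), (1, 3), (2, 3), (2, 6)]:
        print(n, X, omega(n, X, math.e))
```

The search space has $(2X+1)^{n+1}$ elements, so this is practical only for very small parameters; it says nothing about the limits in (11), which concern $X \to \infty$ and $n \to \infty$.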


If $\mu$ is finite, then there is a fixed integer n such that for every $\sigma > 0$ there are integers $x_0, \ldots, x_n$ such that

$$0 < |x_0 + x_1z + \cdots + x_nz^n| < X^{-\sigma}.$$





For the case n = 1, this is exactly the definition of the Liouville numbers, so that the U-numbers may be regarded as higher degree analogs of Liouville numbers. The author has shown that there are U-numbers of every degree.

If z is algebraic, the inequality (7) shows that $\rho_n(X)$, and hence also $\omega_n$, remains bounded as $n \to \infty$, so that $\omega = 0$ and z is an A-number. If, on the other hand, z is transcendental, it follows from Theorem 5-5 that $\omega_n \ge \tfrac{1}{2}(n - 1)$, whence $\omega \ge \tfrac{1}{2}$. Thus the A-numbers are precisely the algebraic numbers.

The existence of T-numbers has never been proved.


Theorem 5-6. If the complex numbers z and w are algebraically dependent, that is, if there is a polynomial F(x, y) with coefficients in Z such that F(z, w) = 0, then they belong to the same class.


Proof: If z is algebraic and w is algebraically dependent on z, then w is clearly also algebraic. We may therefore suppose that z and w are transcendental.

Let

$$F(x, y) = \sum_{h=0}^{M}\sum_{k=0}^{N} a_{hk}x^hy^k,$$

and suppose that F is irreducible. (One consequence of this assumption is that no polynomial in x alone is a factor of F.) Write

$$F(x, y) = \sum_{h=0}^{M} A_h(y)x^h, \qquad\text{where}\qquad A_h(y) = \sum_{k=0}^{N} a_{hk}y^k.$$

We may suppose that $A_M(y)$ is not identically zero.

Let $A(x) = a_0 + a_1x + \cdots + a_nx^n$ be a polynomial for which the minimum is achieved in the definition (10) of $\omega_n(X, z)$, so that in particular $\max(|a_k|) \le X$. We shall obtain inequalities relating $\omega(z)$ and $\omega(w)$; since in the definition of these quantities the first limit is taken on X, we temporarily regard n as fixed and X as a parameter.

Since it is not the case that for each fixed y the polynomials F(x, y) and A(x) have a common zero, we know by a standard theorem* that

* See, for example, B. L. van der Waerden, Modern Algebra (English edition, translated by Fred Blum from the second revised German edition), New York: Frederick Ungar Publishing Co., 1949, Vol. 1, pp. 83-85.



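the resultant

$$R(y) = \begin{vmatrix}
a_0 & a_1 & \cdots & a_n & 0 & \cdots & 0\\
0 & a_0 & a_1 & \cdots & a_n & \cdots & 0\\
\vdots & & \ddots & & & \ddots & \vdots\\
0 & \cdots & 0 & a_0 & a_1 & \cdots & a_n\\
A_0(y) & A_1(y) & \cdots & A_M(y) & 0 & \cdots & 0\\
\vdots & & \ddots & & & \ddots & \vdots\\
0 & \cdots & 0 & A_0(y) & A_1(y) & \cdots & A_M(y)
\end{vmatrix}$$

(M rows formed from the coefficients of A, followed by n rows formed from the $A_h(y)$) is not identically zero. R(y) is a polynomial in y of degree nN at most, with coefficients in Z. Since F is a fixed polynomial throughout, the coefficients in R(y) do not exceed $c_1X^M$, where $c_1$ is a constant depending only on n and F.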

If for each l with $2 \le l \le M + n$, the l-th column in the determinant for R(y) is multiplied by $x^{l-1}$ and added to the first column, the new first column is

$$A(x),\ xA(x),\ \ldots,\ x^{M-1}A(x),\ F(x, y),\ xF(x, y),\ \ldots,\ x^{n-1}F(x, y).$$

Expanding by minors of the new first column, we obtain an identity

$$R(y) = A(x)g(x, y) + F(x, y)h(x, y),$$

from which

$$R(w) = A(z)g(z, w).$$

Regarding g(x, y) as a sum of minors, we see that its coefficients are rational integers not exceeding $c_2X^{M-1}$ in absolute value, so that

$$|g(z, w)| < c_3X^{M-1}.$$

Hence

$$|A(z)| > c_3^{-1}X^{-M+1}|R(w)|.$$

But

$$|R(w)| \ge \omega_{nN}(c_1X^M, w),$$

so

$$|A(z)| > c_3^{-1}X^{-M+1}\,\omega_{nN}(c_1X^M, w).$$





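It follows from the definition of A(x) that

$$\omega_n(X, z) > c_3^{-1}X^{-M+1}\,\omega_{nN}(c_1X^M, w),$$

and so we obtain

$$\omega_n(z) = \limsup_{X\to\infty}\frac{\log\,(1/\omega_n(X, z))}{\log X} \le M - 1 + M\omega_{nN}(w),$$

$$\omega(z) = \limsup_{n\to\infty}\frac{\omega_n(z)}{n} \le \limsup_{n\to\infty}\frac{(M - 1)N + MN\omega_{nN}(w)}{nN} \le MN\omega(w),$$

and

$$\mu(w) \le N\mu(z).$$

By symmetry,

$$\omega(w) \le MN\omega(z) \qquad\text{and}\qquad \mu(z) \le M\mu(w).$$

Thus $\omega(z)$ and $\omega(w)$ are simultaneously finite or infinite, as are $\mu(z)$ and $\mu(w)$; hence z and w are in the same class.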


5-5 Arithmetic properties of the exponential function. In this section we shall prove a theorem due to Mahler which simultaneously shows that e is an S-number (and therefore transcendental), gives a transcendence measure for e, and shows that $\pi$ is transcendental. The transcendence measure is not the most precise one known, but more exact results are more difficult to prove.

We begin with an algebraic analog of Theorem 5-5. Let $\omega_1, \ldots, \omega_m$ be distinct complex numbers (having no connection with the function $\omega_n(z)$ of the preceding section), and let $r_1, \ldots, r_m$ be positive integers. Instead of asking for rational integers $x_0, \ldots, x_m$ for which the quantity $x_0 + x_1\vartheta_1 + \cdots + x_m\vartheta_m$ is numerically small, we shall investigate the polynomials

$$A_k(z) = A_k(z; r_1, \ldots, r_m; \omega_1, \ldots, \omega_m), \qquad k = 1, \ldots, m,$$

of respective degrees $r_1 - 1, \ldots, r_m - 1$ at most, for which the function

$$R(z) = R(z; r_1, \ldots, r_m; \omega_1, \ldots, \omega_m) = A_1(z)e^{\omega_1z} + \cdots + A_m(z)e^{\omega_mz} \tag{12}$$

is algebraically small, i.e., has a Maclaurin expansion beginning with a large power of z. The total number of coefficients among the polynomials $A_k(z)$ is $r = r_1 + \cdots + r_m$; if they are taken as undetermined constants, then the conditions

$$R(0) = 0,\quad R'(0) = 0,\quad \ldots,\quad R^{(r-2)}(0) = 0$$

yield a system of r - 1 linear homogeneous equations in these r unknowns. Such a system always has solutions distinct from (0, 0, ..., 0). Let R(z) temporarily designate any of the functions obtained in this manner; thus R(z), which is not identically zero, certainly has a zero of order r - 1 at z = 0, and could conceivably have one of higher order there. Suppose that the actual order is r - 1 + E, so that R(z) has an expansion

$$R(z) = \sum_{h=r+E-1}^{\infty} a_hz^h, \qquad a_{r+E-1} \ne 0.$$

The non-negative integer E is called the excess, and m is called the order, of R(z). We first show that the excess is always equal to zero.

At least one of the polynomials $A_k(z)$ does not vanish identically, and with no loss in generality we may suppose it to be $A_1(z)$. It is easily proved by induction that if $D = d/dz$,

$$D^a e^{\omega z}A(z) = e^{\omega z}(D + \omega)^aA(z) \tag{13}$$

for every positive integer a and every function A(z) with sufficiently many derivatives. Moreover, if A(z) is a polynomial which is not identically zero, and $\omega \ne 0$, then $(D + \omega)^aA(z)$ is a polynomial of the same degree as A(z). Hence

$$D^{r_m}e^{-\omega_mz}R(z) = D^{r_m}\left(A_1(z)e^{(\omega_1-\omega_m)z} + \cdots + A_{m-1}(z)e^{(\omega_{m-1}-\omega_m)z} + A_m(z)\right)$$
$$= A_1^*(z)e^{(\omega_1-\omega_m)z} + \cdots + A_{m-1}^*(z)e^{(\omega_{m-1}-\omega_m)z}$$
$$= R(z; r_1, \ldots, r_{m-1}; \omega_1 - \omega_m, \ldots, \omega_{m-1} - \omega_m),$$

where $A_1^*$ is not identically zero and, as implied by the notation, $\deg A_k^* \le r_k - 1$ for $k = 1, \ldots, m - 1$. Clearly

$$R^{(\rho)}(0; r_1, \ldots, r_{m-1}; \omega_1 - \omega_m, \ldots, \omega_{m-1} - \omega_m) = 0$$

for $\rho = 0, 1, \ldots, r + E - r_m - 2$, so that from an R-function of order m and excess E we have obtained another of order m - 1 and excess E. Repeating the process, we come finally to a function $R_1(z) = R(z; r_1; \omega)$ of order 1 and excess E. But if

$$R_1(z) = (a_0 + a_1z + \cdots + a_{r_1-1}z^{r_1-1})e^{\omega z},$$

the conditions $R_1(0) = \cdots = R_1^{(r_1+E-2)}(0) = 0$ give $a_0 = \cdots = a_{r_1-1} = 0$ when E > 0, so that there is certainly no such function which does not vanish identically, if E > 0. Hence $a_{r-1} \ne 0$, or equivalently, the coefficient of $z^{r-1}$ in the Maclaurin expansion of R(z) is not zero, while all preceding coefficients are zero. Introducing an appropriate numerical factor, we can put

$$R(z) = \frac{z^{r-1}}{(r-1)!} + b_rz^r + \cdots.$$

The function R and the coefficients $b_r, b_{r+1}, \ldots$ are now uniquely determined, since if there were two such functions for given $\omega_1, \ldots, \omega_m$, $r_1, \ldots, r_m$, their difference would have positive excess. Moreover, while we have so far known only that not all of the polynomials $A_1(z), \ldots, A_m(z)$ are identically zero, we now see that in fact they are of exact degrees $r_1 - 1, \ldots, r_m - 1$, respectively, since otherwise we could have begun with lower degree polynomials and arrived at a function of positive excess. Finally, we see that R(z) is symmetric in the pairs of arguments $(\omega_1, r_1), \ldots, (\omega_m, r_m)$, since the pairs can be permuted while the solution (subject to all the imposed conditions) is unique. This can also be seen by noting that R(z) is the unique solution of the homogeneous linear differential equation

$$(D - \omega_1)^{r_1}\cdots(D - \omega_m)^{r_m}R(z) = 0$$

for which $R(0) = \cdots = R^{(r-2)}(0) = 0$ and $R^{(r-1)}(0) = 1$, and the factors in the differential operator may be permuted at will.
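The characterization of R(z) by a differential equation gives a quick way to compute its Maclaurin coefficients. In the sketch below (all names hypothetical), the operator $\prod_k (D - \omega_k)^{r_k}$ is expanded into a monic polynomial $P(D) = \sum_i P_i D^i$ of degree r, and the derivatives $d_j = R^{(j)}(0)$ are generated from the recurrence $d_{j+r} = -\sum_{i<r} P_i d_{j+i}$ with initial values $0, \ldots, 0, 1$. Integer values of $\omega_k$ are assumed so that the arithmetic stays exact.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def characteristic_polynomial(omegas, degrees):
    """Ascending coefficients of prod_k (u - omega_k)^(r_k)."""
    P = [1]
    for w, r in zip(omegas, degrees):
        for _ in range(r):
            P = poly_mul(P, [-w, 1])
    return P

def derivatives_at_zero(omegas, degrees, count):
    """First `count` derivatives R^(j)(0) of the solution of P(D) R = 0 with
    R(0) = ... = R^(r-2)(0) = 0 and R^(r-1)(0) = 1."""
    P = characteristic_polynomial(omegas, degrees)
    r = len(P) - 1
    d = [0] * (r - 1) + [1]
    while len(d) < count:
        j = len(d) - r
        d.append(-sum(P[i] * d[j + i] for i in range(r)))
    return d[:count]

if __name__ == "__main__":
    # omega = (0, 1), r = (1, 1): R(z) = e^z - 1, all derivatives past the 0th equal 1.
    print(derivatives_at_zero((0, 1), (1, 1), 5))
    # Single pair omega = 2, r_1 = 3: R(z) = z^2 e^(2z) / 2!.
    print(derivatives_at_zero((2,), (3,), 5))
```

Dividing $d_j$ by $j!$ gives the coefficient of $z^j$, so the first nonzero coefficient is $1/(r-1)!$, in agreement with the normalization adopted above.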

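We now obtain explicit expressions for R(z) and the $A_k(z)$. Clearly

$$R(z; r_1; \omega_1) = \frac{z^{r_1-1}e^{\omega_1z}}{(r_1-1)!}, \tag{14}$$

since this function has all the requisite properties and there is only one such function. Suppose that $R(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1})$ has already been determined. Then if J is the operator

$$Jf(z) = \int_0^z f(t)\,dt,$$

we have by (13) that

$$(D - \omega_1)^{r_1}\cdots(D - \omega_\mu)^{r_\mu}\left\{e^{\omega_\mu z}J^{r_\mu}\left(e^{-\omega_\mu z}R(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1})\right)\right\}$$
$$= (D - \omega_1)^{r_1}\cdots(D - \omega_{\mu-1})^{r_{\mu-1}}R(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1}) = 0,$$

and since $R(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1})$ has a zero of order $r_1 + \cdots + r_{\mu-1} - 1$ at 0, the function

$$e^{\omega_\mu z}J^{r_\mu}\left(e^{-\omega_\mu z}R(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1})\right)$$

has a zero of order $r_1 + \cdots + r_\mu - 1$ at 0, and it clearly has leading coefficient $((r_1 + \cdots + r_\mu - 1)!)^{-1}$. Hence

$$R(z; r_1, \ldots, r_\mu; \omega_1, \ldots, \omega_\mu) = e^{\omega_\mu z}J^{r_\mu}\left(e^{-\omega_\mu z}R(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1})\right), \tag{15}$$

and consequently

$$R(z) = \left(e^{\omega_mz}J^{r_m}e^{-\omega_mz}\right)\left(e^{\omega_{m-1}z}J^{r_{m-1}}e^{-\omega_{m-1}z}\right)\cdots\left(e^{\omega_2z}J^{r_2}e^{-\omega_2z}\right)\frac{z^{r_1-1}e^{\omega_1z}}{(r_1-1)!}.$$

We now use the standard formula

$$J^af(z) = \int_0^z\frac{(z-t)^{a-1}}{(a-1)!}f(t)\,dt,$$

which is easily verified by integration by parts. We have

$$e^{\omega_2z}J^{r_2}\left(e^{(\omega_1-\omega_2)z}\frac{z^{r_1-1}}{(r_1-1)!}\right) = e^{\omega_2z}\int_0^z\frac{t^{r_1-1}(z-t)^{r_2-1}}{(r_1-1)!\,(r_2-1)!}e^{(\omega_1-\omega_2)t}\,dt$$
$$= \int_0^z\frac{t_1^{r_1-1}(z-t_1)^{r_2-1}}{(r_1-1)!\,(r_2-1)!}e^{\omega_1t_1+\omega_2(z-t_1)}\,dt_1,$$

and by induction we see that

$$R(z) = \int_0^z dt_{m-1}\int_0^{t_{m-1}}dt_{m-2}\cdots\int_0^{t_2}\frac{t_1^{r_1-1}(t_2-t_1)^{r_2-1}\cdots(z-t_{m-1})^{r_m-1}}{(r_1-1)!\,(r_2-1)!\cdots(r_{m-1}-1)!\,(r_m-1)!}\,e^{\omega_1t_1+\omega_2(t_2-t_1)+\cdots+\omega_m(z-t_{m-1})}\,dt_1. \tag{16}$$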


Before deducing an explicit formula for $A_k(z)$, we recall certain properties of inverse operators. The operator $D^{-1}$, as applied to an integral combination f(z) of polynomials and exponential functions, yields that antiderivative which contains no constant of integration. Hence

$$D^{-\rho}f(z) = J^{\rho}f(z) + \varphi(z),$$

where $\varphi$ is a function annihilated by $D^{\rho}$, that is, a polynomial of degree $\rho - 1$ at most. More generally, if $\omega \ne 0$ we define, by analogy to (13),

$$(D - \omega)^{-\rho}f(z) = e^{\omega z}D^{-\rho}\left(e^{-\omega z}f(z)\right),$$

so that

$$(D - \omega)^{-\rho}f(z) = e^{\omega z}J^{\rho}\left(e^{-\omega z}f(z)\right) + \psi(z), \tag{17}$$

where $\psi(z)$ is annihilated by $(D - \omega)^{\rho}$; that is, it is $e^{\omega z}$ times a polynomial of degree $\rho - 1$ at most. Since

$$(D - \omega)\left(-\frac{z^n}{\omega} - \frac{nz^{n-1}}{\omega^2} - \frac{n(n-1)z^{n-2}}{\omega^3} - \cdots - \frac{n!}{\omega^{n+1}}\right) = z^n,$$

and since no term of the operand is annihilated by $D - \omega$, we can write

$$(D - \omega)^{-1}z^n = -\frac{1}{\omega}\left(1 + \frac{D}{\omega} + \frac{D^2}{\omega^2} + \cdots + \frac{D^n}{\omega^n}\right)z^n.$$

More generally, it can be shown that if F is any polynomial of degree n for which $F(0) \ne 0$, then $(F(D))^{-1}z^n$ can be written as

$$(a_0 + a_1D + \cdots + a_nD^n)z^n, \tag{18}$$

where $a_0 + a_1u + \cdots + a_nu^n$ is the Maclaurin expansion of $(F(u))^{-1}$ to n + 1 terms.*
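Formula (18) lends itself to a direct check with exact rational arithmetic. The sketch below (function names hypothetical, polynomials stored as ascending coefficient lists) expands $1/F(u)$ to n + 1 terms, applies the truncated operator to $z^n$, and then applies $F(D)$ to confirm that $z^n$ is recovered.

```python
from fractions import Fraction

def series_inverse(F, terms):
    """First `terms` Maclaurin coefficients of 1/F(u); requires F[0] != 0."""
    a = []
    for j in range(terms):
        s = sum(F[i] * a[j - i] for i in range(1, min(j, len(F) - 1) + 1))
        a.append((Fraction(int(j == 0)) - s) / F[0])
    return a

def derivative(poly):
    """Derivative of a polynomial given by ascending coefficients."""
    return [k * c for k, c in enumerate(poly)][1:] or [Fraction(0)]

def apply_operator(op, poly):
    """Apply op[0] + op[1]*D + op[2]*D^2 + ... to the polynomial `poly`."""
    result = [Fraction(0)] * len(poly)
    current = list(poly)
    for c in op:
        for k, v in enumerate(current):
            result[k] += c * v
        current = derivative(current)
    return result

if __name__ == "__main__":
    n = 3
    F = [-2, 1]                          # F(D) = D - 2, i.e. omega = 2
    z_n = [0] * n + [1]                  # the polynomial z^3
    g = apply_operator(series_inverse(F, n + 1), z_n)
    print(g)                             # candidate for (D - 2)^(-1) z^3
    print(apply_operator(F, g) == z_n)   # F(D) g should give back z^3
```

Terms of the expansion beyond $u^n$ would contribute nothing, since $D^{n+1}$ annihilates $z^n$; this is why n + 1 terms suffice.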

We can now prove that for $k = 1, \ldots, m$,

$$A_k(z) = \prod_{\substack{h=1\\ h\ne k}}^{m}(D + \omega_k - \omega_h)^{-r_h}\,\frac{z^{r_k-1}}{(r_k-1)!}. \tag{19}$$

For m = 1, the empty product is interpreted as the identity operator, of course, and in this case the correctness of (19) follows from equation (14). Suppose that it is correct for all polynomials

$$A_k(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1})$$


* A more complete discussion of inverse operators is given in E. L. Ince, Ordinary Differential Equations, New York: Longmans, Green & Co., Inc., 1926; reprinted by Dover Publications, New York, 1944; pp. 138-140.




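with $\mu - 1$ pairs $r_k$, $\omega_k$. Then, by (15) and (17),

$$R(z; r_1, \ldots, r_\mu; \omega_1, \ldots, \omega_\mu) = e^{\omega_\mu z}J^{r_\mu}\sum_{k=1}^{\mu-1}A_k(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1})\,e^{(\omega_k-\omega_\mu)z}$$
$$= \sum_{k=1}^{\mu-1}e^{\omega_\mu z}\left\{e^{(\omega_k-\omega_\mu)z}(D - \omega_\mu + \omega_k)^{-r_\mu}A_k(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1}) + p_k(z)\right\}$$
$$= \sum_{k=1}^{\mu-1}e^{\omega_kz}(D - \omega_\mu + \omega_k)^{-r_\mu}A_k(z; r_1, \ldots, r_{\mu-1}; \omega_1, \ldots, \omega_{\mu-1}) + P(z)e^{\omega_\mu z},$$

where $p_k(z)$ and P(z) are polynomials of degree $r_\mu - 1$ at most. It follows that (19) is correct for k = 1, for arbitrary m, and its truth for $k \ge 2$ follows from the previously noted symmetry of R(z) in the pairs $\omega_k$, $r_k$.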

For fixed complex numbers $\omega_1, \ldots, \omega_m$, our considerations up to this point are valid for all the functions $R(z; r_1, \ldots, r_m; \omega_1, \ldots, \omega_m)$ corresponding to arbitrary sets $r_1, \ldots, r_m$ of positive integers. We now specialize the parameters so as to obtain a collection of functions depending on a single parameter $\rho$.

For h and k in the sequence 1, 2, ..., m, define

$$\delta_{hk} = \begin{cases}1 & \text{if } h = k,\\ 0 & \text{if } h \ne k,\end{cases}$$

and put

$$R_h(z) = R_h(z; \rho; \omega_1, \ldots, \omega_m) = R(z; \rho + \delta_{h1}, \ldots, \rho + \delta_{hm}; \omega_1, \ldots, \omega_m),$$
$$A_{hk}(z) = A_{hk}(z; \rho; \omega_1, \ldots, \omega_m) = A_k(z; \rho + \delta_{h1}, \ldots, \rho + \delta_{hm}; \omega_1, \ldots, \omega_m).$$

Here $\rho$ is a fixed but arbitrary positive integer. We form the square matrix

$$A(z) = (A_{hk}(z)), \qquad h, k = 1, \ldots, m,$$

having determinant D(z). Let the minor determinant of $A_{hk}(z)$ in D(z) be $D_{hk}(z)$.

$A_{hk}(z)$ is a polynomial in z of degree $\rho + \delta_{hk} - 1$, and the coefficient of the highest power of z is, by (19),

$$\frac{1}{(\rho + \delta_{hk} - 1)!}\prod_{\substack{l=1\\ l\ne k}}^{m}(\omega_k - \omega_l)^{-\rho-\delta_{hl}}.$$

Hence in the expansion of D(z), the term formed from the elements of the main diagonal will be of higher degree than any other term, and D(z) is therefore a polynomial of degree $m\rho$ with the coefficient of the highest power of z equal to

$$\frac{1}{(\rho!)^m}\prod_{k=1}^{m}\prod_{\substack{l=1\\ l\ne k}}^{m}(\omega_k - \omega_l)^{-\rho}.$$

If, on the other hand, we solve the system of equations

$$\sum_{k=1}^{m}A_{hk}(z)e^{\omega_kz} = R_h(z), \qquad h = 1, \ldots, m,$$

for $e^{\omega_kz}$, we obtain the identity

$$D(z)e^{\omega_kz} = \sum_{h=1}^{m}(-1)^{h+k}D_{hk}(z)R_h(z).$$

Since the expansion of $R_h(z)$ begins with the term $z^{m\rho}/(m\rho)!$, the polynomial D(z) is divisible by $z^{m\rho}$. Hence

$$D(z) = \frac{z^{m\rho}}{(\rho!)^m}\prod_{k=1}^{m}\prod_{\substack{l=1\\ l\ne k}}^{m}(\omega_k - \omega_l)^{-\rho}, \tag{20}$$

and D(z) vanishes only at z = 0.

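Let $c_1, c_2, \ldots$ be positive constants depending only on $m, \omega_1, \ldots, \omega_m$. (In particular, they must not depend on $\rho$, which will eventually be large.)

Examination of equation (16) shows that for $1 \le h \le m$,

$$R_h(1) = O\left(c_1^{\rho}(\rho!)^{-m}\right).$$

From (19), we obtain

$$A_{hk}(z) = \prod_{\substack{l=1\\ l\ne k}}^{m}\left(\sum_{\lambda_l=0}^{\infty}\binom{-\rho-\delta_{hl}}{\lambda_l}(\omega_k - \omega_l)^{-\rho-\delta_{hl}-\lambda_l}D^{\lambda_l}\right)\frac{z^{\rho+\delta_{hk}-1}}{(\rho+\delta_{hk}-1)!},$$

where the sums need not be extended past the index $\rho$. Let

$$\Omega = \prod_{\substack{h,k=1\\ h<k}}^{m}(\omega_k - \omega_h),$$

so that $\Omega$ can be regarded as a polynomial of total degree $m(m-1)/2$ in $\omega_1, \ldots, \omega_m$, with coefficients in Z not exceeding $2^{m^2}$ in absolute value. Since no exponent $\rho + \delta_{hl} + \lambda_l$ in the above sums exceeds $2\rho + 1$, the expression

$$a_{hk} = \Omega^{2\rho+1}\rho!\,A_{hk}(1)$$

is a polynomial in $\omega_1, \ldots, \omega_m$ of total degree

$$\frac{m(m-1)}{2}(2\rho + 1)$$

at most, whose coefficients are rational integers of the order of magnitude $O(c_2^{\rho}\rho!)$. Finally, we put

$$r_h = \sum_{k=1}^{m}a_{hk}e^{\omega_k},$$

so that

$$r_h = \Omega^{2\rho+1}\rho!\,R_h(1) = O\left(c_3^{\rho}(\rho!)^{-(m-1)}\right).$$

The quantities $r_1, \ldots, r_m$ are linear forms in the numbers $e^{\omega_k}$, and they are linearly independent, in the sense that no linear combination of the vectors $(a_{h1}, \ldots, a_{hm})$, for $1 \le h \le m$, is the zero vector. This is equivalent to the assertion that $D(1) \ne 0$, which follows from (20).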


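Theorem 5-7. Suppose that $\omega_1, \ldots, \omega_m$ all lie in an algebraic number field K of degree g, and let

$$L_h = \sum_{k=1}^{m}b_{hk}e^{\omega_k}, \qquad h = 1, \ldots, \mu,$$

be $\mu$ independent linear forms in $e^{\omega_1}, \ldots, e^{\omega_m}$ with coefficients $b_{hk}$ in Z. Suppose that

$$\frac{m(g-1)}{g} < \mu \le m, \tag{21}$$

and put

$$b = \max_{\substack{1\le h\le\mu\\ 1\le k\le m}}(|b_{hk}|), \qquad s = \max_{1\le h\le\mu}(|L_h|).$$

Then to each $\epsilon > 0$ there corresponds a $b_0(\epsilon)$ such that $s > b^{-\tau-\epsilon}$ if $b > b_0(\epsilon)$, where

$$\tau = \frac{m\mu g}{\mu g - m(g-1)} - 1.$$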




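Proof: By a well-known theorem on independence,* the $\mu$ forms $L_1, \ldots, L_\mu$, together with $m - \mu$ of the forms $r_h$, which we may designate by $r_{h_1}, \ldots, r_{h_{m-\mu}}$, are independent. Hence the determinant

$$\Delta = \begin{vmatrix}
a_{h_11} & \cdots & a_{h_1m}\\
\vdots & & \vdots\\
a_{h_{m-\mu}1} & \cdots & a_{h_{m-\mu}m}\\
b_{11} & \cdots & b_{1m}\\
\vdots & & \vdots\\
b_{\mu1} & \cdots & b_{\mu m}
\end{vmatrix}$$

is not zero; it is obviously a polynomial in $\omega_1, \ldots, \omega_m$ of degree at most

$$\frac{m(m-1)}{2}(2\rho + 1)(m - \mu)$$

with coefficients in Z of the order of magnitude $O(c_4^{\rho}\,\rho!^{\,m-\mu}\,b^{\mu})$. It follows, first, that there is a rational integer $c_5$ such that $c_5^{\rho}\Delta$ is an integer of K, and, second, that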

[a] = 0((<7 + irc 4 '’p!'"-'' 6 V) 

where Cg is an upper bound for the various numbers IwTI. Hence if 
A, a", ..., A^*'^ are the field conjugates of A, we have 

I ^ C 6 ^(C 5 ^A") • • ■ (C 5 *’Af^>) 

A N(c 5 '’A) 

Moreover, using subscripts on A to indicate minors, we have 

A/k = 0(c9'’pr"'*“V), A,*rA, = O(cio'’pr^b^), 

for I < I < m — n, 

Art = OCci'p = 0(c.2'’p!’"-'‘fc'-'s), 

for m — M + 1 < ^ ^ 

* Cf. B. L. van der Waerden, Modern Algebra (English edition, translated by Fred Blum from the second revised German edition), New York: Frederick Ungar Publishing Co., 1949, Vol. 1, p. 101.






Using the identity 

m —41 m 

Ae"‘= Z (-l)'+‘A,tr,,+ Z {- 
it follows that 

or 

> Ci5 - C 16 • (22) 

From the inequality (21), the exponent $m(g - 1) - \mu g$ is negative.
Hence the quantity 

C 13 '’ 

p jMff—m{p—1) 

may increase for small values of p, but it tends to zero as p increases 
indefinitely. At any rate, we can say that for 6 larger than some cn, 
the smallest value of p for which 

Y ^ cuCi3'’p!”'^*>"«6«' 

is so large that els'* is negligible as compared to the factorial, and for 
such p we have the asymptotic relation 


log pi 




By (22), for b > 6o(0. 


where 


pg - m(g - 1) 


5 > 


TT log b. 


T 


,g-i + Ju t . - 

pg — m{g — 1) 


This proves the theorem. 


mpg 

pg - m(g - 1) 



Theorem 5-8. Suppose that $\vartheta_1, \ldots, \vartheta_N$ are elements of an algebraic number field K of degree g, and that they are linearly independent over the rationals, so that no relation of the form

$$d_1\vartheta_1 + \cdots + d_N\vartheta_N = 0$$

holds with $d_1, \ldots, d_N$ rational and not all zero. Then if the coefficients in the linear form

$$L = \sum_{\lambda_1=0}^{M_1}\cdots\sum_{\lambda_N=0}^{M_N}b_{\lambda_1\cdots\lambda_N}\,e^{\lambda_1\vartheta_1+\cdots+\lambda_N\vartheta_N} \tag{23}$$

in the quantities $e^{\lambda_1\vartheta_1+\cdots+\lambda_N\vartheta_N}$ are rational integers with

$$b = \max(|b_{\lambda_1\cdots\lambda_N}|),$$

there is a constant $\Gamma$, depending only on g and N, such that for sufficiently large b,

\L\ > 

Proof: Let /in be positive integers, and consider the 

quantities 

T _ H- j 

l\ ^ 0> ) fi\ f . . . , — 0^ > MNf (2^) 

their number being 

M = (mi + 1) • ’ • (mn + !)• 

If we introduce the exponential factor in (24) inside the summation 
in (23); we see that the various Lh-iff may be regarded as linear 
forms in the quantities 

where 

wxi—Xftr = ^1*^1 + • • ■ + h/vdjv, 

Xi = 0, 1, . .. , + Ml; ■ • ■ J An = 0» L • ■ •) "b 

the number of w’s being 

m = (Ml + Ml + 1) ■ • ■ (-^JV + MN + 1)- 

The numbers o-x.-x;, are distinct on account of the independence of 

over the rationals, so that we can speak of the mdepend- 
ence of’the forms To see that as a matter of fact they are 

independent, order the subscript sets . Xxr and i,... (at by inter¬ 
preting the X’s and I’s as digits in the base g, for some sufficiently 
large g Then there cannot be a linear relation among the coefficient 
vectors of any set of forms, since the aix,...xv with largest subscnpt 
occurs only in the form with largest subscript. FinaUy, there 

are positive constants « and d which are independent of the coefh- 




cients bxi...Xiv> for which 




a < 


It follows from Theorem 5-7 that if 




w ^ ^ + >^, + 1 9 

M .-1 Mi+1 


9 - 1 


(25) 


then for b > b{t), 


m > 6— 


where 


T = 


+ + • • • (A/Ar + MAr + l)(Ml + l) • • • (mA^ + 1) 


+ ■ • • (MiV+1) —(ff—1 )(A/i+mi + 1) 

Condition (25) is satisfied if 


(A/at+mn+I) 


-1 


since then 


A/. 


L(^) - 'J 


With this choice of we have 


T " 


n(i + ^) 

t-l \ Hi + 1/ 


29 




1 - - -11 n + 

9 


, ^ 2^-1 
— 1 Sm --:-1-1 


1 - 


ff - 1 2g 


9 2g - I 

= 2gH~ (26) 

Since > 1 and N > 1, we have Hi > Mi > 1. Since [xj + 1 < 2x 
for X > 1, we have 


n<U 


2Mi 





and we have the theorem with


Taking N = 1, we have

Corollary 1. If $\vartheta \ne 0$ is algebraic, $e^{\vartheta}$ is an S-number, and in particular is transcendental.

For $\vartheta = \pi i$, $e^{\vartheta} = -1$ is not transcendental. Hence

Corollary 2. $\pi$ is transcendental.

We also have the following result, first proved by F. Lindemann in 1882.

Corollary 3. If $\vartheta_1, \ldots, \vartheta_N$ are algebraic and are linearly independent over the rationals, then $e^{\vartheta_1}, \ldots, e^{\vartheta_N}$ are algebraically independent over the field of algebraic numbers, that is, there is no polynomial $P(z_1, \ldots, z_N)$ with algebraic coefficients not all zero for which

$$P(e^{\vartheta_1}, \ldots, e^{\vartheta_N}) = 0.$$

Finally, for N = 1 the brackets can be omitted in the definition of $\mu_1$; then $\mu_1 = (2g - 1)M_1$ and

$$\tau \le 2g\mu - 1 \le 2g((2g - 1)M_1 + 1) - 1 = 2g(2g - 1)M_1 + 2g - 1.$$


Corollary 4. If d ^ 0 is algebraic of degree g, then the function 

^{n, t) = 

is a transcendence measure for e^. 


5-6 A theorem of Schneider. In addition to the Liouville numbers 
and values of the exponential function, many other specific numbers 
are known to be transcendental. To indicate the type of results 
known, we mention the following: 

(a) The Bessel functions $J_0(x)$ and $J_0'(x)$ are transcendental for algebraic $x \ne 0$.

(b) If $\alpha$ and $\beta$ are algebraic, $\alpha \ne 0$ or 1, and $\beta$ is irrational, then $\alpha^{\beta}$ is transcendental. (In particular, $e^{\pi} = (-1)^{-i}$ is included.)





(c) At least one of the numbers $g_2$, $g_3$, $\omega_2$ associated with a Weierstrass $\wp$-function is transcendental, and if $g_2$ and $g_3$ are algebraic, at least one of z and $\wp(z)$ is transcendental.

(d) If f(x) is a polynomial whose value is in Z for argument in Z, and f(x) > 0 for x > 0, then the number

$$0.f(1)f(2)f(3)\ldots,$$

formed by juxtaposing the decimal representations of the values f(x), is transcendental. (An example is the number 0.1361015..., generated by $f(x) = (x^2 + x)/2$.)

(e) If $\omega$ is a positive quadratic irrationality, then the number

$$\sum_{n=0}^{\infty}[n\omega]z^n$$

is transcendental for algebraic $z \ne 0$.
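The number described in item (d) is simple to generate digit by digit. The sketch below (function name hypothetical) juxtaposes the decimal representations of f(1), f(2), ... and reproduces the example given there for $f(x) = (x^2 + x)/2$; it illustrates the construction only and says nothing about the transcendence proof.

```python
def juxtaposed_decimal(f, terms):
    """Return the string "0." followed by the decimal digits of f(1), ..., f(terms)."""
    return "0." + "".join(str(f(x)) for x in range(1, terms + 1))

if __name__ == "__main__":
    triangular = lambda x: (x * x + x) // 2   # integer-valued for integer x
    print(juxtaposed_decimal(triangular, 8))
```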

On the other hand, it is not known whether the following numbers 
are transcendental: 


(a) $\gamma = \lim_{n\to\infty}\left(1 + \dfrac{1}{2} + \cdots + \dfrac{1}{n} - \log n\right)$,
(b) !-( 2 n + l) 

A 

(c) $\Gamma(x)$ for algebraic x not in Z,

(d) e T, ev. 

The methods used to prove what little is known about specific transcendental numbers show considerable variety, both in technique and conception. T. Schneider has recently shown, however, that several results which earlier required separate proofs can all be obtained from a single theorem. This theorem says nothing directly about transcendental numbers; rather, its sense is that if several transcendental functions assume algebraic values at a large number of points, then they must either have large rates of growth or be algebraically dependent (as functions). The prototype of Schneider's result, proved by G. Pólya in 1920, asserts that if f is an integral transcendental function which assumes values in Z for z = 0, 1, 2, ..., then

$$\limsup_{r\to\infty}\frac{\log M(r)}{r} \ge \log 2,$$

where, as usual,

$$M(r) = \max_{|z|=r}(|f(z)|).$$

There have been many refinements and extensions of Pólya's work, of course; we mention only that by A. Gelfond in 1929, where this kind of theorem was first used for transcendence investigations. (His result was that $\alpha^{\beta}$ is transcendental for algebraic $\alpha \ne 0, 1$, if $\beta$ is an imaginary quadratic irrationality.)

In this section we shall prove Schneider's theorem, and in the next we shall apply it to the numbers $\alpha^{\beta}$. (The facts mentioned above concerning the $\wp$-function can also be deduced, but the requisite preliminaries preclude doing so here.) Since the statement of the theorem is complicated, we first introduce some notation.

By the order of an entire function f(z) we mean, as usual, the quantity

$$\limsup_{R\to\infty}\frac{\log\log M(R)}{\log R};$$

if f(z) is of order $\rho$, then

$$f(z) = O\left(e^{|z|^{\rho+\epsilon}}\right)$$

as $|z| \to \infty$, for every fixed $\epsilon > 0$. Let $\zeta_1, \zeta_2, \ldots$ be an infinite sequence of complex numbers. Designate by

$$z_0(m) = z_0,\ \ldots,\ z_{k(m)}(m) = z_k$$

the distinct numbers among $\zeta_1, \ldots, \zeta_m$, and by $l_\kappa(m) + 1 = l_\kappa + 1$ the multiplicity of occurrence of $z_\kappa$ among $\zeta_1, \ldots, \zeta_m$. Thus

$$\sum_{\kappa=0}^{k}(l_\kappa + 1) = m. \tag{27}$$

x-0 


Let r(m) = r be the radius of the smallest circle about the origin which contains $z_0, \ldots, z_k$, and put


a 


. >gm 

= lim inf ■;-» 



SO that a < oo . Let 

I = max (io» • • • > 


Finally, let $K$ be a fixed algebraic number field of degree $g$, and, as
always, let $\overline{|\alpha|}$ be the maximum of the absolute values of the
conjugates of $\alpha$, for $\alpha$ in $K$.


Theorem 5-9. Let $f_1(z), \dots, f_n(z)$ be meromorphic functions with
the property that for each m, the numbers





are in K. Let be positive rational integers such that all the 

numbers 

(z^), X = 0,...»f,, X = 0,. . ., fc, V = 1, . . . , n, 


are integers in K. Suppose that 




For each $\nu$, if $f_\nu(z)$ is entire let it be of order $\mu_\nu$, and otherwise suppose
that there is an entire function $G_\nu(z)$ of order $\mu_\nu$ such that $G_\nu(z) f_\nu(z)$ is
entire and also of order $\mu_\nu$. Suppose that


and put 




Suppose finally that 

log log max (|G,(z,)r^) 

' = 1."■ (31) 

and 


log log max H.(z.)) 

0<tt<k 

lun sup- 


m* 


log m 


< Vn *' = 1 , 


Then $f_1, \dots, f_n$ are algebraically dependent over $K$.
Proof: We form a polynomial

$$\Phi(z) = \sum_{\tau_1=0}^{t_1} \cdots \sum_{\tau_n=0}^{t_n} C_{\tau_1 \cdots \tau_n} f_1^{\tau_1}(z) \cdots f_n^{\tau_n}(z),$$


., n. 

(32) 





and seek to determine the coefficients $C_{\tau_1 \cdots \tau_n}$ so that $\Phi$ has a zero of
order $l_\kappa + 1$ at $z_\kappa$, for $\kappa = 0, \dots, k$. Here the numbers $k$, $z_\kappa$, and $l_\kappa$
are all defined in terms of the sequence $\zeta_1, \zeta_2, \dots$ and an index $m$, as
explained earlier; $m$ is fixed, and will be specified more exactly later.
The conditions imposed on $\Phi$ require that all the numbers

$$\Phi^{(\lambda)}(z_\kappa), \quad \lambda = 0, \dots, l_\kappa, \quad \kappa = 0, \dots, k,$$

shall vanish, and this in turn yields a set of $m$ homogeneous linear
equations of the form

$$\sum_{\tau} a_\mu C_{\tau_1 \cdots \tau_n} = 0, \quad \mu = 1, \dots, m, \tag{33}$$

in the $t = (t_1 + 1) \cdots (t_n + 1)$ unknowns $C_{\tau_1 \cdots \tau_n}$. (Of course, the
numbers $a_\mu$ also depend on $\tau_1, \dots, \tau_n$.) We put

*' = 1 . •.. . ( 34 ) 


so that 

^ = (^1 + 1) ■ ■ * (^n + 1) ^ 




1 /n 




The coefficients in equations (33) are by assumption numbers in 
K, and after multiplication by the rational integral factor 


n (H.(0.))^' 

r-l 

they become integers, say of K. The size of the coefficients is 
determined in part by this numerical factor, in part by the values of 
the /„ and their derivatives at the various points z„, and in part by 
the numerical coefficients introduced by differentiation. In the esti¬ 
mate (36) below, the second of these is accounted for by (32). The 
third depends only on the set of exponents and the order 

of the derivative considered, and so can be computed from the fact 
that the sum of all the coefficients in the expansion of 


by the product formula is 


n 


E r. 

f ^1 






- X +1 


) ^ fe '■) 





We thus obtain the bound 

rs5 < n (//.( 2 ,))'- ■ n exp ■ ( Z » (36) 

r-l .--1 V = 1 / 

where $\epsilon_1 > 0$ and $\epsilon_1 \to 0$ as $m \to \infty$. (Hereafter we designate any
quantity with the latter properties by $\epsilon$, and any positive integer
independent of $m$, $l$, and $r$ by $\gamma$.)

It follows from the inequality (30) and the definition of rj, that 

+ ■ ■ • + IJn < “ 1. (37) 


so that, by (34), 



< (y^O^- 


Using this, together with (32) and (36), we obtain 


< ( 7 m)* exp 2 Z 


By (29), < 7 ”. By (37), = n — 1—5 with 5 a 

positive constant; hence 


“ (1 + *7i + ■ ■ ■ 4* »In) + « = 
n 



We henceforth require $m$ to be so large that

$$\epsilon < \frac{\delta}{n}. \tag{38}$$



Then 


Rijn < 7 ”. 



Using this and (35), we shall now show that there are coefficients
$C_{\tau_1 \cdots \tau_n}$ satisfying (33) which are integers in $K$, are not all zero, and are
such that

1 • (40) 

To simplify the notation, arrange the $C_{\tau_1 \cdots \tau_n}$ in some fixed linear
order, and rewrite (33) in the form

$$\sum_{\tau=1}^{t} A_{\mu\tau} C_\tau = 0, \quad \mu = 1, \dots, m.$$

Let $\rho_1, \dots, \rho_g$ be an integral basis for $K$, let $h \in Z$ be positive, and
let $B$ be the set of integers in $K$ of the form

$$b_1 \rho_1 + \cdots + b_g \rho_g,$$

where the $b$'s range independently over the rational integers such that
$|b| \le h$. For each set of elements $X_1, \dots, X_t$ of $B$, put

$$y_\mu = \sum_{\tau=1}^{t} A_{\mu\tau} X_\tau, \quad \mu = 1, \dots, m.$$

This defines $(2h + 1)^{gt}$ $m$-tuples $y_1, \dots, y_m$, not necessarily different
from one another. Also, since

$$\overline{|X_\tau|} < \gamma h, \tag{41}$$

we have from (39) that

$$\overline{|y_\mu|} < \gamma^m h. \tag{42}$$

Each number $y_\mu$ has a basis representation $c_1 \rho_1 + \cdots + c_g \rho_g$;
similar representations, with the same $c_i$ and with the $\rho_i$ replaced by
their conjugates, hold for the conjugates of $y_\mu$. The determinant
formed from the $\rho_i$ and their conjugates is not zero, so that it is
possible to solve the $g$ equations defining $y_\mu$ and its conjugates for the
numbers $c_i$, giving each $c_i$ as a linear expression in the conjugates of
$y_\mu$ with coefficients depending only on $K$. From (42) it follows that
for $i = 1, \dots, g$,

$$|c_i| \le \gamma^m h. \tag{43}$$

There are, however, exactly $(2\gamma^m h + 1)^g$ different integers of $K$
whose basis representation satisfies (43); therefore there are at most
$(2\gamma^m h + 1)^{gm}$ different systems $y_1, \dots, y_m$. If

$$(2\gamma^m h + 1)^{gm} < (2h + 1)^{gt}, \tag{44}$$

then two systems $y_1, \dots, y_m$ corresponding to two different sets
$X_1, \dots, X_t$ coincide, and the respective differences $X_1 - X_1', \dots,$
$X_t - X_t'$ constitute a solution of (33). These differences, which we
call $C_1, \dots, C_t$, are not all zero, and by (41), they satisfy the
condition

$$\overline{|C_\tau|} < \gamma h.$$

By (35), $t \ge 2m$, so (44) holds if

$$(2\gamma^m h + 1)^m < (2h + 1)^{2m},$$

which is clearly true if $h = \gamma^m$. But then (40) holds.
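The box-principle argument above can be tried out in the simplest case, where $K$ is the rational field (so $g = 1$ and the set $B$ is just the integers of absolute value at most $h$). The sketch below is only an illustration under that assumption: the matrix `A`, the function name `small_solution`, and the use of the range $0, \dots, H$ in place of $-h, \dots, h$ are choices made here, not taken from the text.

```python
from itertools import product

def small_solution(A, H):
    """Search for a nonzero integer vector C with A C = 0 and |C_j| <= H.

    A is an m x t integer matrix given as a list of rows. Every vector X with
    entries in 0..H is mapped to y = A X; two different vectors with the same
    image differ by a nonzero solution. Returns None if no two images
    coincide for this H.
    """
    t = len(A[0])
    seen = {}
    for X in product(range(H + 1), repeat=t):
        y = tuple(sum(a * x for a, x in zip(row, X)) for row in A)
        if y in seen:
            # The difference of two vectors with equal images solves A C = 0.
            return [x - x0 for x, x0 in zip(X, seen[y])]
        seen[y] = X
    return None

# Hypothetical system: m = 2 equations in t = 4 = 2m unknowns.
A = [[3, -1, 4, 1],
     [2, 7, -1, -8]]

H = 1
while (C := small_solution(A, H)) is None:
    H += 1   # enlarge the box until two images are forced to coincide
```

Because the number of vectors grows like $(H+1)^t$ while the number of possible images grows only like a constant times $H^m$, the loop must stop once $t > m$; this is the same counting that underlies inequality (44).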





We now designate by $m_0$ a fixed value of $m$ such that (38) holds,
and by $k_0$ and $l_\kappa^{(0)}$ the corresponding values of $k$ and $l_\kappa$, and define $\Phi$
to be a fixed function corresponding to $m_0$ and having all the properties
described up to this point. We are now able to perform an
induction.

We know that $\Phi$ possesses $m_0$ zeros, if each is counted with its
proper multiplicity. It is asserted that if $m_0$ is sufficiently large, then
$\Phi(z)$ vanishes at all the points $\zeta_1, \zeta_2, \dots$. This is proved inductively
by showing that if $\Phi(z) = 0$ for $z = \zeta_1, \dots, \zeta_m$, with $m \ge m_0$, then
also $\Phi(\zeta_{m+1}) = 0$, if $m_0$ is sufficiently large. More precisely, we
assume that $\Phi$ has a zero at $z_\kappa$ ($\kappa = 0, \dots, k$) of order $l_\kappa + 1$, with

$$\sum_{\kappa=0}^{k} (l_\kappa + 1) = m \ge m_0,$$

and shall deduce that $\Phi$ has a zero at $\zeta_{m+1} = \zeta$ of order $\Lambda + 1$, where

$$\Lambda = \begin{cases} 0 & \text{if } \zeta = z_{k+1} \ne z_\kappa \text{ for } \kappa = 0, \dots, k, \\ l_\sigma + 1 & \text{if } \zeta = z_\sigma \text{ and } 0 \le \sigma \le k. \end{cases}$$

Here $k = k(m)$ and $l_\kappa = l_\kappa(m)$.
Put 


G{z) = UGMz)-, 


then G{z)^{z) is an entire function which vanishes at the same points 
2 , as‘I>( 2 ), and to the same order, by (31). We also put 


Q(z) = n (z - 


By Cauchy^s theorem, 

d^{G(z)^(z)/Q{z)) 
dz^ 

Here V is the circle 

I2I = /?! = R^, 


L 


G{z)^{z) dz 


Q{z) (z - f)^+‘ 


!?> 1, 


(45) 


where R = r(m + 1) if a < <» (we recall that r(m) was defined 
earlier as the radius of the smallest circle about 0 containing 
^Or • • • , 2 fc), while if a = «, /? is so chosen that 


« > r(m + 1), 


logm -h 1 

hm - = 00 


m—• • 


log/? 


lim R ~ <x>. 


m* 




Since 4> has a zero of order A at the left side of (45) is simply 

G{z) 


so that 


Q(z) 






Qin 

Gin 


L 


l*=f 
Giz)^iz) 


dz 


(46) 


Qiz) (z - f)"+^ 

We shall use this representation to estimate l4>^^^(f)l. By the in¬ 
equality (40), we have 


max \Giz)^iz)\ < < 7 ”® exp 
W-Ri 



where e —♦ 0 as » go , or equivalently as m ^ 00 , By the defini¬ 
tion (28) of a, 

or 

Ri < 


even for a = 00 , Hence 

max \Giz)^iz)\ 

1*1=Ri 


< 7 ”® exp 



since it is easily seen from the definition (34) that t < 7 ”®. From 
(34) we also have that 


r-l P-l 




2 I l^jfi (1+’}|+* * •+«Jn) / n—rl"* 





We may suppose, with no loss in generality, that each $\eta_\nu \le 1$.
For suppose that $\eta_n$, say, is larger than 1. Then in analogy with (30),


Ml + • • • + Mn—1 

n - 2 




and all hypotheses of the theorem are satisfied by the $n - 1$ functions
$f_1(z), \dots, f_{n-1}(z)$. But if $f_1(z), \dots, f_{n-1}(z)$ can be shown to be
algebraically dependent, then $f_1(z), \dots, f_n(z)$ must also be dependent.
Consequently, if we put $\vartheta = 1 + \delta/2n$, we have

$$(\vartheta - 1) \max_{1 \le \nu \le n} (\eta_\nu) \le \frac{\delta}{2n}.$$


If mo, and hence also m > mo, is so large that 


< < ;r» 

2n 


(47) 


n 


then 


and 


£ < nm, 

»°>i 


(48) 


(49) 


(50) 


(51) 


max 1(?(2)4 »(z)| < 

111 - fti 

Continuing the estimation of the right side of (46), we notice that
since $\vartheta > 1$, and since $R$ grows indefinitely with $m$, it is possible to
choose $m_0$ so large that

• I I 

mm 12 - 2.1 > — > 

1*1 - Ri ^ 

for X = 0, . . . , A:, and also 

min I 2 - f I > ~ ■ 

— ) 

2 / 

Since $\zeta$ and all the $z_\kappa$ lie in the disk $|z| \le R$, we have

$$|\zeta - z_\kappa| \le 2R,$$

so that $|Q(\zeta)| \le (2R)^m$.

Finally, we see from (31) that 

lG(f)| > exp^- E > 7 -™. 

Combining the relations (48), (51), (52), and (53), we have 

< (2Rr ■ 7”a'' ■ 7"’ ■ ' • Ki 

<-•©" 


(52) 


( 53 ) 





Since by hypothesis 




the inequality 
holds; hence 


I = max (i*, A) < 
o<*<* 


m + 1 

log (m + 1) ’ 


A^ < 7” 





Recalling that $\Phi(z)$ is a polynomial in $f_1(z), \dots, f_n(z)$ with coefficients
$C_\tau$ which are integers in $K$, and that all the derivatives of
$f_1(z), \dots, f_n(z)$ up to order $\Lambda$ have values in $K$ for $z = \zeta$, we see
that also $\Phi^{(\Lambda)}(\zeta)$ is a number in $K$, and that the product




r~l 


is an integer in K. By the same reasoning as was used in producing 
the estimate (36), we have 

n ■ i$w(r)i < y"" ■ n • (z 

r-l ».-l V-1 / 

■ n exp ■ n (i,. + 1). 

K-l 

The factor comes from the estimate (40) for while the last 
product is the total number of terms in 4>(z) itself, which was an 
unnecessary factor in (36), where we were estimating the terms in a 
derivative arising from a single term in #(z). By arguments used 
previously, it follows from the last inequality that 

n < 7”. 

r-l 


Combining this with (54), we have 

I N (f) n I < (S5) 

and the upper bound here is smaller than 1 for $m$ sufficiently large,
say $m > m_1$. Hence if $m_0$ is so large that $m_0 > m_1$, and the inequalities
(47), (49), and (50) hold, then it follows from (55) that

$$\Phi^{(\Lambda)}(\zeta) = 0,$$

as asserted.




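To complete the proof of Theorem 5-9, we shall make use of the
following general considerations. Let $\varphi(z)$ be analytic in the disk
bounded by a circle $\Gamma$, and let $x_1, \dots, x_p$ be interior points of this
disk. Then $\varphi(z)$ has an expansion

$$\varphi(z) = a_0 + a_1(z - x_1) + a_2(z - x_1)(z - x_2) + \cdots + a_{p-1}(z - x_1) \cdots (z - x_{p-1}) + (z - x_1) \cdots (z - x_p) R_p(z) \tag{56}$$

with constants $a_0, \dots, a_{p-1}$ and a function $R_p(z)$ regular in the disk.
In fact, if we put

$$\frac{1}{2\pi i} \int_\Gamma \frac{\varphi(t)\, dt}{(t - x_1) \cdots (t - x_{q+1})} = a_q \tag{57}$$

for $q = 0, \dots, p - 1$, and

$$\frac{1}{2\pi i} \int_\Gamma \frac{\varphi(t)\, dt}{(t - z)(t - x_1) \cdots (t - x_p)} = R_p(z), \tag{58}$$

then

$$\varphi(z) - \bigl(a_0 + a_1(z - x_1) + \cdots + a_{p-1}(z - x_1) \cdots (z - x_{p-1})\bigr)$$

$$= \frac{1}{2\pi i} \int_\Gamma \varphi(t) \left( \frac{1}{t - z} - \frac{1}{t - x_1} - \frac{z - x_1}{(t - x_1)(t - x_2)} - \cdots - \frac{(z - x_1) \cdots (z - x_{p-1})}{(t - x_1) \cdots (t - x_p)} \right) dt$$

$$= \frac{1}{2\pi i} \int_\Gamma \varphi(t)\, \frac{(z - x_1) \cdots (z - x_p)}{(t - z)(t - x_1) \cdots (t - x_p)}\, dt = (z - x_1) \cdots (z - x_p) R_p(z),$$

and it is clear from (58) that $R_p(z)$ is regular inside $\Gamma$.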
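When the points $x_1, \dots, x_p$ are distinct, the coefficients $a_q$ of (57) coincide with the classical Newton divided differences of $\varphi$, a standard fact that the text does not state. Under that assumption the expansion (56) can be checked by exact arithmetic for a polynomial of degree less than $p$, for which the remainder term vanishes. The polynomial, the points, and the function names below are illustrative choices.

```python
from fractions import Fraction

def divided_differences(xs, ys):
    """Return a_0, ..., a_{p-1}, where a_q is the divided difference of the
    values ys on the distinct points x_1, ..., x_{q+1}."""
    coeffs = list(ys)
    p = len(xs)
    for level in range(1, p):
        # Work from the end so that coeffs[i - 1] still holds the
        # previous-level value when coeffs[i] is updated.
        for i in range(p - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs

def newton_value(xs, coeffs, z):
    """Evaluate a_0 + a_1 (z - x_1) + ... + a_{p-1} (z - x_1)...(z - x_{p-1})."""
    total, prod = Fraction(0), Fraction(1)
    for x, a in zip(xs, coeffs):
        total += a * prod
        prod *= (z - x)
    return total

def phi(z):
    # Illustrative polynomial of degree 3.
    return z**3 - 2*z + 5

xs = [Fraction(v) for v in (0, 1, 3, 4)]        # p = 4 interpolation points
a = divided_differences(xs, [phi(x) for x in xs])

# Degree 3 < p = 4, so the remainder in (56) is identically zero and the
# Newton part reproduces phi exactly at any other point.
assert newton_value(xs, a, Fraction(7)) == phi(Fraction(7))
```

If the function vanishes at every interpolation point, all the divided differences are zero, which is the situation exploited in the remainder of the proof.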

We apply this with $\varphi(z) = \Phi(z)G(z)$, the "interpolation points"
$x_1, \dots, x_p$ being $\zeta_1, \dots, \zeta_m$, in this order. For $\Gamma$ we choose the
circle $|z| = R_1 = R^\vartheta$ with $\vartheta > 1$, where $R \ge r(m)$, $\lim_{m \to \infty} R = \infty$, and

$$\liminf_{m \to \infty} \frac{\log m}{\log R} = \alpha.$$

Since $\Phi(z)$ vanishes at all the points $\zeta_1, \zeta_2, \dots$, the integrand in the
expression (57) for $a_q$ is regular in $\Gamma$, so that

$$a_q = 0 \quad \text{for } q = 0, \dots, m - 1.$$



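Hence for fixed $z$ with $z \ne \zeta_1, \zeta_2, \dots$,

$$G(z)\Phi(z) = \lim_{m \to \infty} \left( \frac{Q_m(z)}{2\pi i} \int_\Gamma \frac{G(t)\Phi(t)}{Q_m(t)}\, \frac{dt}{t - z} \right),$$

where

$$Q_m(z) = \prod_{\kappa=0}^{k} (z - z_\kappa)^{l_\kappa + 1}, \qquad \sum_{\kappa=0}^{k} (l_\kappa + 1) = m.$$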

As in the derivation of (54), we have 


n 


|(7(2)'!>(2)I < r"” n «. + 1) exp 


(J: <.«!'"+•) (2r)" 




< 




Since this inequality holds for arbitrarily large $m$, and since $R$ increases
indefinitely with $m$ while $\gamma$ does not, it must be that

$$G(z)\Phi(z) = 0.$$

Hence $\Phi(z)$ vanishes for all $z$, which is the assertion of the theorem.

5-7 The Hilbert-Gelfond-Schneider theorem. As an application 
of Schneider’s theorem of the preceding section, we now prove 

Theorem 5-10. If a and b are algebraic numbers, b is irrational,
and a is neither 0 nor 1, then $a^b$ is transcendental.

This theorem settles a question raised by Euler concerning the 
arithmetic nature of the logarithm of a rational number to a rational 
base, and repeated in more general form in the seventh of Hilbert’s 
famous list of 23 outstanding problems which seemed to him to be 
both difficult and important. The list appeared in 1900, but it was 
not until 1929 that Gelfond made the first contribution to the solution 
of this problem. Further partial results were obtained by Kusmin,
Siegel, and Boehle, and in 1934 complete proofs were given almost 
simultaneously by Gelfond and Schneider. As mentioned earlier, the 
proof to be given now is most nearly in the spirit of Gelfond’s 1929 
paper; it should be instructive to the reader also to examine the 
original complete proofs by Gelfond and Schneider. 

We apply Theorem 5-9 with $n = 2$, $f_1(z) = a^z$, $f_2(z) = z$, and
$\zeta = u + vb$, where $u$ and $v$ range over the positive integers. On
account of the irrationality of $b$, the numbers $\zeta$ are distinct; they
are to be ordered by the size of $u + v$, and otherwise arbitrarily.
Suppose that in the sequence $\zeta_1, \dots, \zeta_m$, all the numbers occur for
which $u + v \le d$, and possibly some (but not all) of those for which
$u + v = d + 1$. Then clearly

$$\frac{d(d - 1)}{2} \le m < \frac{d(d + 1)}{2},$$

while

$$r = \max (|u + vb|) \le (d + 1)(1 + |b|),$$

and (taking $u = d - 1$, $v = 1$),

$$r \ge d - 1 - |b|.$$

These inequalities show that $\gamma_1 r^2 < m < \gamma_2 r^2$, for some positive $\gamma_1$
and $\gamma_2$. Thus

$$\alpha = \lim_{m \to \infty} \frac{\log m}{\log r} = 2.$$
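The counting behind $\alpha = 2$ can be observed numerically. In the sketch below the choice $b = \sqrt{2}$ and the function name are illustrative assumptions; the code lists all points $u + vb$ with $u + v \le d$ and compares $\log m$ with $\log r$.

```python
import math

def sequence_stats(b, d):
    """Points u + v*b with positive integers u, v and u + v <= d.

    Returns (m, r): the number of such points and the largest absolute
    value among them.
    """
    points = [u + (s - u) * b for s in range(2, d + 1) for u in range(1, s)]
    return len(points), max(abs(p) for p in points)

b = math.sqrt(2)   # illustrative irrational algebraic number
for d in (10, 100, 1000):
    m, r = sequence_stats(b, d)
    # The ratio log m / log r approaches 2 slowly as d grows.
    print(d, m, round(math.log(m) / math.log(r), 3))
```

The convergence is slow because of the constant factors $\gamma_1$ and $\gamma_2$, which contribute a term of size $O(1/\log r)$ to the ratio.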

By the choice of $f_1(z)$ and $f_2(z)$, $\mu_1 = 1$ and $\mu_2 = 0$, and the
inequality (30) holds. Since $a^z$ and $z$ are entire, (31) is without force.
If we suppose that $z$ and $a^z$ are elements of an algebraic number field
$K$ for $z = b$, then $f_2(\zeta) = u + vb$ and $f_1(\zeta) = a^u (a^b)^v$ are also in $K$
for positive integral $u$ and $v$. (We need not examine the derivatives,
since $\zeta_1, \zeta_2, \dots$ are distinct.) Moreover, if $c$ is a positive rational
integer such that $ca$, $cb$, and $ca^b$ are integers of $K$, then we can choose

$$H_1(z_\kappa) = c^{u+v} \quad \text{and} \quad H_2(z_\kappa) = c$$

for $z_\kappa = u + vb$. It follows from this and the definitions of $f_1(z)$ and
$f_2(z)$ that the inequality (32) holds. Thus, under the assumption
that $a$, $b$, and $a^b$ are all algebraic, all hypotheses of Theorem 5-9 are
satisfied, and it follows that $z$ and $a^z$ are algebraically dependent.
This being palpably false, the above assumption cannot be maintained,
and the theorem is proved.


PROBLEM 

Show that $e^\alpha$ is transcendental for algebraic $\alpha \ne 0$. (Hint: Choose
$f_1(z) = e^z$, $f_2(z) = z$, and

$$\zeta_{(n-1)k+1} = \cdots = \zeta_{nk} = n\alpha,$$

for $n = 1, 2, \dots$.)




REFERENCES 
Section 5-1 

I. Niven, Bulletin of the American Mathematical Society 53, 509 (1947).

Section 5-2

J. Liouville, Comptes Rendus Hebdomadaires des Séances de l'Académie
des Sciences (Paris) 18, 883-885, 910-911 (1844); Journal des Mathématiques
Pures et Appliquées (Paris) 16, 133-142 (1851).

Sections 5-3, 5-4, 5-5

Most of this material is adapted from Mahler, Journal für die Reine und
Angewandte Mathematik (Berlin) 166, 117-150 (1932). For the existence of
U-numbers of each degree, see LeVeque, Journal of the London Mathematical
Society 28, 220-229 (1953).

Section 5-6

Siegel's work on Bessel functions is to be found in Abhandlungen der Kgl.
Preussischen Akademie der Wissenschaften (Berlin), article no. 1, 70 pp.
(1929). The first result stated concerning the $\wp$-function is due to Siegel,
Journal für die Reine und Angewandte Mathematik 167, 62-69 (1932); the
second to Schneider, ibid., 172, 70-74 (1934). The transcendence of decimals
formed from polynomial values was proved by Mahler, Proceedings
Konink. Nederlandsche Akademie van Wetenschappen (Amsterdam) 40,
421-428 (1937), and that of the series by Mahler, Mathematische
Annalen (Leipzig) 101, 342-366 (1929) and 103, 532 (1930), and Mathematische
Zeitschrift (Berlin) 32, 545-585 (1930).

Schneider's Theorem 5-9 appeared in Mathematische Annalen 121,
131-140 (1949); he includes a bibliography of work on integral-valued
functions. Pólya's work appeared in Nachrichten von der Gesellschaft der
Wissenschaften zu Göttingen, pp. 1-10 (1920), and Gelfond's in Tôhoku
Mathematical Journal (Sendai, Japan) 30, 280-285 (1929).

Section 5-7

Hilbert's problems appeared in Nachrichten von der Gesellschaft der
Wissenschaften zu Göttingen, pp. 253-297 (1900). The complete solution
of the seventh was given by Gelfond, Comptes Rendus de l'Académie des
Sciences de l'U.R.S.S. (Moscow) 2, 1-3 (in Russian), 4-6 (in French)
(1934), and Bulletin de l'Académie des Sciences de l'U.R.S.S. (Leningrad)
7, 623-640 (1934); and by Schneider, Journal für die Reine und Angewandte
Mathematik 172, 65-69 (1934). There is an excellent exposition of
Gelfond's method by E. Hille, American Mathematical Monthly 49, 654-661
(1942).



CHAPTER 6
DIRICHLET'S THEOREM


In this chapter and the next we shall consider various questions
concerning the distribution of the rational primes. This is a large
and difficult field, and we shall be able to obtain only a few of the
important results. The first of them, to which this chapter is devoted,
is Dirichlet's famous theorem that there are infinitely many
primes of the form $km + l$, where $k$ and $l$ are fixed integers which are
relatively prime.


6-1 Introduction. Although proofs of certain special cases of
Dirichlet's theorem are given in elementary texts,* the methods used
cannot be generalized to prove the full theorem. To get an idea of
the method used by Dirichlet, let us consider the question of the
infinitude of the set of primes of the form $4k + 1$. We base the discussion
on the Riemann $\zeta$-function, defined for $s > 1$ by the equation

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$

This is perhaps the simplest of all the Dirichlet series

$$\sum_{n=1}^{\infty} \frac{a_n}{n^s},$$

which play an important role in prime number theory. One reason
for their importance is exhibited in the following theorem, which
gives a relation between the set of primes and the set of positive
integers.

Theorem 6-1. For $s > 1$,

$$\zeta(s) = \prod_p (1 - p^{-s})^{-1}. \tag{1}$$

Proof: In less abbreviated form, the assertion is that

$$\lim_{N \to \infty} \prod_{p \le N} \left(1 - \frac{1}{p^s}\right)^{-1} = \lim_{N \to \infty} \sum_{n=1}^{N} \frac{1}{n^s}.$$

* See, for example, Volume I, pp. 9, 46, 59.



The relation

$$\frac{1}{1 - x} = 1 + x + x^2 + \cdots = \sum_{n=0}^{\infty} x^n$$

holds for $|x| < 1$; since $|p^{-s}| < 1$, we have

$$\prod_{p \le N} (1 - p^{-s})^{-1} = \prod_{p \le N} (1 + p^{-s} + p^{-2s} + \cdots).$$

Multiplying out the product on the right, we obtain terms of the
form $n^{-s}$, where $n$ runs over the integers composed exclusively of
primes not exceeding $N$. Moreover, each such $n$ occurs exactly once,
by the Unique Factorization Theorem. The multiplication is permissible,
since the series involved are absolutely convergent, and
the terms can be arranged in any order. Thus

$$\prod_{p \le N} (1 - p^{-s})^{-1} = {\sum}' n^{-s},$$

where the accent indicates a summation, in the natural order, over
all $n$ such that $p \mid n$ implies $p \le N$. In particular, the sum contains
all terms $n^{-s}$ for which $n \le N$. Hence

$$\prod_{p \le N} (1 - p^{-s})^{-1} = \sum_{n \le N} n^{-s} + {\sum_{n > N}}' n^{-s},$$

and

$${\sum_{n > N}}' n^{-s} \le \sum_{n > N} n^{-s} = o(1)$$

as $N \to \infty$, since $s > 1$. Thus

$$\lim_{N \to \infty} \prod_{p \le N} (1 - p^{-s})^{-1} = \lim_{N \to \infty} \sum_{n=1}^{N} n^{-s} = \zeta(s).$$

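To see exactly how $\zeta(s)$ behaves as $s \to 1^+$, we use the following
standard result.*

Lemma. Suppose that $\lambda_1, \lambda_2, \dots$ is a nondecreasing sequence tending
to infinity, that $c_1, c_2, \dots$ is an arbitrary sequence of real or
complex numbers, and that $f(x)$ has a continuous derivative for
$x \ge \lambda_1$. Put

$$C(x) = \sum_{\lambda_n \le x} c_n.$$

Then for $x \ge \lambda_1$,

$$\sum_{\lambda_n \le x} c_n f(\lambda_n) = C(x) f(x) - \int_{\lambda_1}^{x} C(t) f'(t)\, dt.$$

* See, for example, Volume I, Theorem 6-15.

Applying this with $\lambda_n = n$, $c_n = 1$, $f(x) = x^{-s}$, we obtain

$$\sum_{n \le x} \frac{1}{n^s} = \frac{[x]}{x^s} + s \int_1^x \frac{[t]}{t^{s+1}}\, dt$$

for $x \ge 1$. If we put $(x) = x - [x]$, we have for $s > 1$,

$$\sum_{n \le x} \frac{1}{n^s} = s \int_1^x \frac{dt}{t^s} - s \int_1^x \frac{(t)}{t^{s+1}}\, dt + \frac{1}{x^{s-1}} - \frac{(x)}{x^s}$$

$$= \frac{s}{s - 1} - \frac{s}{(s - 1) x^{s-1}} - s \int_1^x \frac{(t)}{t^{s+1}}\, dt + \frac{1}{x^{s-1}} - \frac{(x)}{x^s}.$$

Letting $x$ increase without bound and noting that $0 \le (x) < 1$,
we have

$$\zeta(s) = \frac{s}{s - 1} - s \int_1^{\infty} \frac{(t)}{t^{s+1}}\, dt. \tag{2}$$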


This expression for $\zeta(s)$ agrees with the earlier definition for $s > 1$,
but it is also meaningful for $0 < s < 1$, since the integral converges
for all $s > 0$. It may therefore be thought of as defining $\zeta(s)$ for
$s > 0$, $s \ne 1$. At any rate, (2) shows that

$$\lim_{s \to 1^+} \zeta(s)(s - 1) = 1, \tag{3}$$

and a fortiori that

$$\lim_{s \to 1^+} \zeta(s) = \infty. \tag{4}$$
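Formula (2) lends itself to a direct numerical check, because on each interval $[n, n+1]$ the integrand $(t)/t^{s+1} = (t - n)\,t^{-s-1}$ has an elementary antiderivative. The sketch below truncates the integral at a cutoff $N$ (an assumption made here; the neglected part changes the value by less than $N^{-s}$) and compares with the classical value $\zeta(2) = \pi^2/6$, which is not derived in the text.

```python
import math

def frac_integral(s, N):
    """Integral of (t)/t^(s+1) over [1, N], where (t) = t - [t].

    On [n, n+1] the integrand equals (t - n) * t^(-s-1), which is
    integrated in closed form (valid for s > 0, s != 1).
    """
    total = 0.0
    for n in range(1, N):
        a, b = float(n), float(n + 1)
        total += (b ** (1 - s) - a ** (1 - s)) / (1 - s)   # integral of t^(-s)
        total += n * (b ** (-s) - a ** (-s)) / s           # minus n * integral of t^(-s-1)
    return total

def zeta_via_formula_2(s, N=2000):
    """Approximate zeta(s) by (2) with the integral cut off at N."""
    return s / (s - 1) - s * frac_integral(s, N)

z2 = zeta_via_formula_2(2.0)
near_one = (1.01 - 1) * zeta_via_formula_2(1.01)   # should be close to 1, as in (3)
```

The second quantity illustrates the limit (3): as $s$ decreases to 1 the product $(s - 1)\zeta(s)$ approaches 1.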


For the remainder of this section, let $q$ and $r$ designate primes of
the forms $4k + 1$ and $4k - 1$ respectively. Define the function
$\chi(n)$ by the equations

$$\chi(1) = 1, \quad \chi(q) = 1, \quad \chi(r) = -1, \quad \chi(2) = 0,$$

$$\chi(mn) = \chi(m)\chi(n) \text{ for every pair of integers } m, n.$$

(A function which satisfies the last of these conditions is said to be
completely multiplicative; it is entirely determined when its values for
all prime arguments are known, since $\chi(p^\alpha) = (\chi(p))^\alpha$.) Inasmuch
as $n \equiv 1 \pmod 4$ if and only if $2 \nmid n$ and the total number of $r$'s
dividing $n$ is even, we have

$$\chi(n) = \begin{cases} 0 & \text{if } 2 \mid n, \\ (-1)^{(n-1)/2} & \text{if } 2 \nmid n. \end{cases}$$

We now investigate the function

$$L(s) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}.$$


If we write $\sum a_n \ll \sum b_n$ to mean that $|a_n| \le b_n$ for $n = 1, 2, \dots$,
then

$$\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} \ll \sum_{n=1}^{\infty} \frac{1}{n^s}$$

for $s > 1$, so that the series for $L(s)$ is absolutely convergent for
$s > 1$. More than this is true, however: the series for $L(s)$ converges
for $s > 0$. For we note that for any $n > 0$,

$$\chi(n) + \chi(n + 1) + \chi(n + 2) + \chi(n + 3) = 0,$$

so that we have

$$\sum_{n=1}^{N} \chi(n) = \sum_{n=1}^{4} \chi(n) + \sum_{n=5}^{8} \chi(n) + \cdots + \sum_{n=4[N/4]-3}^{4[N/4]} \chi(n) + \sum_{n=4[N/4]+1}^{N} \chi(n)$$

$$= 0 + 0 + \cdots + 0 + \sum_{n=4[N/4]+1}^{N} \chi(n),$$

and hence

$$\left| \sum_{n=1}^{N} \chi(n) \right| \le 1.$$

The truth of the assertion is therefore a weak consequence of the
following theorem, which is due to Abel.

Theorem 6-2. If $\{a_n\}$ is a sequence of constants for which

$$\sum_{n=1}^{N} a_n = O(1)$$

as $N \to \infty$ and if $\{b_n(s)\}$ is a sequence of positive-valued functions
which converges monotonically and uniformly to zero for $s$ in some
interval $J$, then the series

$$\sum_{n=1}^{\infty} a_n b_n(s)$$

converges uniformly for $s$ in $J$.

Proof: Put

$$A_n = \sum_{k=1}^{n} a_k,$$

so that $|A_n| \le A$ for some $A$ and all $n$. Using the monotonicity of
$b_n(s)$, we have

$$\left| \sum_{n=j}^{l} a_n b_n(s) \right| = \left| \sum_{n=j}^{l} (A_n - A_{n-1}) b_n(s) \right|$$

$$= \left| \sum_{n=j}^{l-1} A_n (b_n(s) - b_{n+1}(s)) + A_l b_l(s) - A_{j-1} b_j(s) \right|$$

$$\le A (b_j(s) - b_l(s)) + A b_l(s) + A b_j(s) = 2A b_j(s).$$

By hypothesis, this upper bound can be made uniformly small, for
$s$ in $J$, by taking $j$ sufficiently large. This proves the theorem.


Here we have a situation which does not arise in the case of power
series. For while a power series converges absolutely at every interior
point of its interval of convergence, the Dirichlet series for $L(s)$
converges for $s > 0$, but converges absolutely only for $s > 1$, since
the series

$$\sum_{n=1}^{\infty} \frac{|\chi(n)|}{n} = \sum_{n=0}^{\infty} \frac{1}{2n + 1}$$

diverges.
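The properties of $\chi$ used above are easy to verify by machine for small arguments. The sketch below checks complete multiplicativity and the bound on the partial sums, and evaluates a partial sum of the series for $L(1)$; the comparison value $\pi/4$ is the classical Leibniz series, a fact not stated in the text. The function names are illustrative.

```python
import math

def chi(n):
    """The function chi of this section: 0 for even n, (-1)^((n-1)/2) for odd n."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def L_partial(s, N):
    """Partial sum of the Dirichlet series for L(s) up to n = N."""
    return sum(chi(n) / n ** s for n in range(1, N + 1))

# Complete multiplicativity on a small range of positive integers.
assert all(chi(m * n) == chi(m) * chi(n)
           for m in range(1, 60) for n in range(1, 60))

# The partial sums of chi take only the values 0 and 1, hence are bounded by 1.
running, bounded = 0, True
for n in range(1, 1001):
    running += chi(n)
    bounded = bounded and (0 <= running <= 1)

approx_L1 = L_partial(1, 100000)   # conditionally convergent at s = 1
```

At $s = 1$ the series converges although it does not converge absolutely, which is the phenomenon described in the preceding paragraph.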

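On account of the complete multiplicativity of $\chi$, we have

$$\left(1 - \frac{\chi(p)}{p^s}\right)^{-1} = 1 + \frac{\chi(p)}{p^s} + \frac{(\chi(p))^2}{p^{2s}} + \cdots = 1 + \frac{\chi(p)}{p^s} + \frac{\chi(p^2)}{p^{2s}} + \cdots.$$

Using this idea, the proof of Theorem 6-1 can easily be modified to
yield

Theorem 6-3. If $f$ is completely multiplicative, and the series

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$$

converges absolutely for $s > s_0$, then

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \prod_p \left(1 - \frac{f(p)}{p^s}\right)^{-1}$$

for $s > s_0$.

Corollary. For $s > 1$,

$$L(s) = \prod_p \left(1 - \frac{\chi(p)}{p^s}\right)^{-1}.$$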

We are finally in a position to prove Dirichlet's theorem for primes
of the form $4k + 1$. Let $s$ be greater than 1. We have

$$\zeta(s) = \prod_p (1 - p^{-s})^{-1} = (1 - 2^{-s})^{-1} \prod_q (1 - q^{-s})^{-1} \prod_r (1 - r^{-s})^{-1}$$

and, from the corollary to Theorem 6-3,

$$L(s) = \prod_q (1 - q^{-s})^{-1} \prod_r (1 + r^{-s})^{-1}.$$

Hence

$$\zeta(s) L(s) = (1 - 2^{-s})^{-1} \prod_q (1 - q^{-s})^{-2} \prod_r (1 - r^{-2s})^{-1}. \tag{5}$$

Now, for $s > 1$,

$$L(s) = \left(1 - \frac{1}{3^s}\right) + \left(\frac{1}{5^s} - \frac{1}{7^s}\right) + \cdots > 1 - \frac{1}{3^s} > \frac{2}{3},$$

and so

$$\lim_{s \to 1^+} \zeta(s) L(s) = \infty$$

by (4). If there were only finitely many primes $q$, the expression
on the right side of (5) would remain bounded as $s \to 1^+$, since
for $s > 1$,

$$\prod_r (1 - r^{-2s})^{-1} < \prod_r (1 - r^{-2})^{-1} < \prod_p (1 - p^{-2})^{-1} = \zeta(2).$$

This contradiction shows not only that there are infinitely many
primes $q$, but also that they occur sufficiently frequently that

$$\lim_{s \to 1^+} \prod_q (1 - q^{-s})^{-1} = \infty.$$

The proof which has just been given contains most of the essential
features of the general proof. The major formal difference which will
arise in the general case is that we shall have to consider a number
of functions like $\chi$ above, and each will have an associated Dirichlet
series, some aspects of whose behavior must be investigated. The
most difficult part of the proof lies in showing that these series do
not vanish at $s = 1$, a point which caused no trouble in the present
case.


PROBLEM 

Let $b_n = 1$ or 0, according as the equation $n = x^2 + y^2$ has or does not
have a solution in integers $x$, $y$. It is known* that $b_n = 1$ if and only if
every prime $r \equiv -1 \pmod 4$ which divides $n$ occurs to an even power in
the canonical factorization of $n$. Show that the series

$$\sum_{n=1}^{\infty} \frac{b_n}{n^s}$$

converges for $s > 1$, and diverges for $s \le 1$. [Hint: Establish a relation
among $\zeta(s)$, $L(s)$, and the square of the given series.]

6-2 Characters. We recall that the elements of a reduced residue
system (mod $k$) form an abelian group under multiplication (mod $k$),
which we designate by $M(k)$. The number of elements of $M(k)$,
called its order, is $\varphi(k)$; hereafter we shall use $h$ as an abbreviation for
$\varphi(k)$.

One of the fundamental theorems on finite abelian groups is that
every such group has a basis: if it is a multiplicative group, this
means that there is a set of elements $A_1, \dots, A_r$ such that every
element of the group can be written uniquely in the form

$$A_1^{x_1} \cdots A_r^{x_r},$$

where each $x_i$ is one of the integers $0, 1, \dots, \operatorname{ord} A_i - 1$, and $\operatorname{ord} A_i$ is
the order of the cyclic subgroup generated by $A_i$. Moreover, the
product of all the numbers $\operatorname{ord} A_i$ is the order of the group. The
following theorem, for which we give a proof based on the theory of
primitive roots,† is a special case.

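Theorem 6-4. (a) Let $k = p_1^{\alpha_1} \cdots p_r^{\alpha_r}$, where $p_1 < \cdots < p_r$ and each
of the prime powers $p_i^{\alpha_i}$ has a primitive root, say $g_i$. Then the numbers
$A_1, \dots, A_r$ form a basis for $M(k)$ if, for each $i$,

$$A_i \equiv \begin{cases} g_i \pmod{p_i^{\alpha_i}}, \\ 1 \pmod{p_j^{\alpha_j}} & \text{if } j \ne i,\ 1 \le j \le r. \end{cases}$$

(b) Let $k = 2^\alpha p_2^{\alpha_2} \cdots p_r^{\alpha_r}$, where $\alpha \ge 3$, and let $g_i$ be a primitive
root of $p_i^{\alpha_i}$ for $2 \le i \le r$. Then the numbers $A_0, A_1, \dots, A_r$ constitute
a basis for $M(k)$, where

$$A_0 \equiv \begin{cases} -1 \pmod{2^\alpha}, \\ 1 \pmod{p_i^{\alpha_i}} & \text{for } 2 \le i \le r, \end{cases} \qquad A_1 \equiv \begin{cases} 5 \pmod{2^\alpha}, \\ 1 \pmod{p_i^{\alpha_i}} & \text{for } 2 \le i \le r, \end{cases}$$

and for $2 \le i \le r$,

$$A_i \equiv \begin{cases} g_i \pmod{p_i^{\alpha_i}}, \\ 1 \pmod{p_j^{\alpha_j}} & \text{for } j \ne i,\ 1 \le j \le r. \end{cases}$$

* See, for example, Volume I, Theorem 7-3.
† See, for example, Volume I, Chapter 4.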


Proof: Let $a$ be relatively prime to $k$. Then it is also prime to every
divisor of $k$, so that there are unique elements $a_1, \dots, a_r$ of
$M(p_1^{\alpha_1}), \dots, M(p_r^{\alpha_r})$, respectively, such that

$$a \equiv a_1 \pmod{p_1^{\alpha_1}}, \quad \dots, \quad a \equiv a_r \pmod{p_r^{\alpha_r}}. \tag{6}$$

Conversely, for any choice of $a_1, \dots, a_r$ in $M(p_1^{\alpha_1}), \dots, M(p_r^{\alpha_r})$,
respectively, the system (6) has a solution $a$ which is unique modulo $k$,
by the Chinese Remainder Theorem, and $a$ is prime to $k$. Moreover,
if $a$ is the solution of (6), and if, for $1 \le i \le r$, $b_i$ is the solution of the
system

$$b_i \equiv \begin{cases} a_i \pmod{p_i^{\alpha_i}}, \\ 1 \pmod{p_j^{\alpha_j}} & \text{for } j \ne i,\ 1 \le j \le r, \end{cases} \tag{7}$$

then

$$b_1 \cdots b_r \equiv 1 \cdots 1 \cdot a_i \cdot 1 \cdots 1 = a_i \pmod{p_i^{\alpha_i}}, \quad \text{for } 1 \le i \le r,$$

so that

$$a \equiv b_1 \cdots b_r \pmod k. \tag{8}$$

(Thus, in the language of group theory, $M(k)$ is the direct product
of $M(p_1^{\alpha_1}), \dots, M(p_r^{\alpha_r})$.)



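Now if $p_i^{\alpha_i}$ has a primitive root $g_i$, and

$$A_i \equiv \begin{cases} g_i \pmod{p_i^{\alpha_i}}, \\ 1 \pmod{p_j^{\alpha_j}}, & j \ne i, \end{cases}$$

then, since

$$a_i \equiv g_i^{\operatorname{ind} a_i} \pmod{p_i^{\alpha_i}},$$

we have that

$$b_i \equiv A_i^{\operatorname{ind} a_i} \pmod{p_j^{\alpha_j}}, \quad \text{for } 1 \le j \le r,$$

and hence

$$b_i \equiv A_i^{\operatorname{ind} a_i} \pmod k.$$

Thus by the congruence (8), if all $p_i^{\alpha_i}$ have primitive roots,

$$a \equiv A_1^{\operatorname{ind} a_1} \cdots A_r^{\operatorname{ind} a_r} \pmod k,$$

and this representation is unique if each index is given its smallest
non-negative value, so that $0 \le \operatorname{ind} a_i < \varphi(p_i^{\alpha_i})$.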

On the other hand, if $p_1^{\alpha_1} = 2^\alpha$ with $\alpha \ge 3$, then $-1$ and 5 constitute
a basis for $M(p_1^{\alpha_1})$. For* 5 is a primitive $\lambda$-root of $2^\alpha$, so that
the $2^{\alpha-2}$ numbers

$$5, 5^2, \dots, 5^{2^{\alpha-2}}$$

are distinct (mod $2^\alpha$); since they are all congruent to 1 (mod 4),
and since there are exactly $2^{\alpha-2}$ numbers in a reduced residue system
(mod $2^\alpha$) which are congruent to 1 (mod 4), these must be the
numbers. Likewise, their negatives are all the numbers congruent
to $-1$ (mod 4) in a reduced residue system (mod $2^\alpha$).† Hence, if $a$
is in $M(2^\alpha)$, then, for some choice of $x_0$ and $x_1$,

$$a \equiv (-1)^{x_0} 5^{x_1} \pmod{2^\alpha}.$$

Thus if $A_0, \dots, A_r$ are defined as in part (b) of the theorem, we have

$$a \equiv A_0^{x_0} A_1^{x_1} A_2^{\operatorname{ind} a_2} \cdots A_r^{\operatorname{ind} a_r} \pmod k,$$

and the representation is again unique if we require that

$$0 \le x_0 < \operatorname{ord} A_0 = 2,$$

$$0 \le x_1 < \operatorname{ord} A_1 = 2^{\alpha-2},$$

$$0 \le \operatorname{ind} a_i < \operatorname{ord} A_i = \varphi(p_i^{\alpha_i}).$$

* See Volume I, Theorem 4-9.

† A similar argument is used in Volume I, in the proof of Theorem 5-1.



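Notice that in the two cases we have

$$\operatorname{ord} A_1 \cdots \operatorname{ord} A_r = \varphi(p_1^{\alpha_1}) \cdots \varphi(p_r^{\alpha_r}) = h,$$

$$\operatorname{ord} A_0 \cdot \operatorname{ord} A_1 \cdots \operatorname{ord} A_r = 2 \cdot 2^{\alpha-2} \cdot \varphi(p_2^{\alpha_2}) \cdots \varphi(p_r^{\alpha_r}) = h.$$

To obviate the distinction between cases (a) and (b), we rename the
basis elements $B_1, \dots, B_m$, and put $\operatorname{ord} B_i = h_i$ for $i = 1, \dots, m$.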
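Whether a proposed list of residues is a basis of $M(k)$ can be checked by brute force for small moduli. In the sketch below the function names and the sample moduli are illustrative assumptions; the bases for $k = 15$, $k = 16$, and $k = 24$ were obtained by hand from the congruences of Theorem 6-4.

```python
from itertools import product
from math import gcd

def order(a, k):
    """Multiplicative order of a modulo k (assumes k >= 2 and gcd(a, k) == 1)."""
    x, n = a % k, 1
    while x != 1:
        x, n = x * a % k, n + 1
    return n

def is_basis(basis, k):
    """True if every reduced residue mod k equals B_1^x1 ... B_m^xm for
    exactly one exponent tuple with 0 <= x_i < ord B_i."""
    orders = [order(B, k) for B in basis]
    values = set()
    n_tuples = 0
    for exps in product(*(range(h) for h in orders)):
        a = 1
        for B, x in zip(basis, exps):
            a = a * pow(B, x, k) % k
        values.add(a)
        n_tuples += 1
    phi = sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)
    # Uniqueness: no two tuples give the same residue; completeness: phi(k) residues.
    return len(values) == n_tuples == phi

# Case (a): k = 15, with A_1 = 11 (2 mod 3, 1 mod 5) and A_2 = 7 (1 mod 3, 2 mod 5).
case_a = is_basis([11, 7], 15)
# M(16): the residues -1 and 5.
case_16 = is_basis([15, 5], 16)
# Case (b): k = 24 = 2^3 * 3, with A_0 = 7, A_1 = 13, A_2 = 17.
case_b = is_basis([7, 13, 17], 24)
```

The product of the orders equals $\varphi(k)$ in each of these examples, in agreement with the two displayed relations above.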

A complex-valued function $\chi$, defined over the group $M(k)$ (more
generally, over any finite abelian group), is called a character (mod $k$)
(or a character of the group) if it is completely multiplicative and not
identically zero, that is, if

$$\chi(ab) = \chi(a)\chi(b), \quad \text{for } a \text{ and } b \text{ in } M(k),$$

$$\chi(a) \ne 0, \quad \text{for some } a \text{ in } M(k).$$

Since in the group we identify integers which are congruent
(mod $k$), we have

$$\chi(a) = \chi(a'), \quad \text{if } a \equiv a' \pmod k \text{ and } (a, k) = (a', k) = 1,$$

so that one could also think of characters as being defined over the
residue classes themselves. Notice that necessarily $\chi(1) = 1$, since
for any $a$ for which $\chi(a) \ne 0$, we have $\chi(a) = \chi(a \cdot 1) = \chi(a)\chi(1)$.
Moreover, if $a$ is in $M(k)$ and $\operatorname{ord} a = l$, then

$$(\chi(a))^l = \chi(a^l) = \chi(1) = 1.$$

Since $l \mid h$, it follows that every value of every character is an $h$th root
of unity.

On account of its complete multiplicativity, any character is totally
determined when its value is specified for each basis element $B_j$.
Thus the characters are contained in the set of all completely multiplicative
functions over $M(k)$ for which

$$\chi(B_j) = e^{2\pi i \beta_j / h_j}, \quad 0 \le \beta_j < h_j, \tag{9}$$

for $j = 1, \dots, m$. But conversely, every such function is obviously
a character, and different choices of the $\beta$'s lead to different characters.
Thus there are $h$ different characters, corresponding to the $h_1 \cdots h_m$
different $m$-tuples $(\beta_1, \dots, \beta_m)$.

Two groups $G$ and $G'$, with elements $a, b, \dots$ and $a', b', \dots$, are
said to be isomorphic if it is possible to find a pairing of elements of $G$
with elements of $G'$, such that each element of $G$ corresponds to
precisely one element of $G'$, and conversely, and such that if $a \leftrightarrow a'$
and $b \leftrightarrow b'$, then $ab \leftrightarrow a'b'$. In this case the groups are abstractly
identical, and any theorem concerning one group has an immediate
analog for the other group. To construct such an isomorphism
between two finite abelian groups, it suffices to find a one-to-one
correspondence of basis elements such that corresponding elements
have equal orders. For let the bases be $C_1, \dots, C_s$ and $C_1', \dots, C_s'$,
so named that $\operatorname{ord} C_i = \operatorname{ord} C_i'$, for $i = 1, \dots, s$. Then we can
make $a$ and $a'$ correspond if

$$a = C_1^{x_1} \cdots C_s^{x_s} \quad \text{and} \quad a' = C_1'^{\,x_1} \cdots C_s'^{\,x_s}, \qquad 0 \le x_i < \operatorname{ord} C_i.$$

For if also

$$b = C_1^{y_1} \cdots C_s^{y_s} \quad \text{and} \quad b' = C_1'^{\,y_1} \cdots C_s'^{\,y_s},$$

then

$$ab = C_1^{x_1 + y_1} \cdots C_s^{x_s + y_s}$$

and

$$(ab)' = C_1'^{\,x_1 + y_1} \cdots C_s'^{\,x_s + y_s} = a'b'.$$

Moreover, this is a one-to-one correspondence, since the representations
by basis elements are unique for the ranges $0 \le x_i < \operatorname{ord} C_i =$
$\operatorname{ord} C_i'$, $1 \le i \le s$.

For the basis $B_1, \dots, B_m$ of $M(k)$, define characters $\chi_1, \dots, \chi_m$
as follows:

$$\chi_i(B_j) = \begin{cases} e^{2\pi i / h_i} & \text{if } j = i, \\ 1 & \text{if } j \ne i,\ 1 \le j \le m. \end{cases} \tag{10}$$

Then from the sentence containing equation (9), we see that every
character can be represented uniquely in the form

$$\chi = \chi_1^{\beta_1} \cdots \chi_m^{\beta_m}, \quad 0 \le \beta_i < h_i \text{ for } i = 1, \dots, m,$$

since this gives $\chi(B_j)$ as in (9). (We say that two characters are
equal if they have the same value for every element of the group, and
define the product of two characters as the function whose values
are the products of the component values; this function is also a
character, by the sentence following (9).) Under multiplication, the
characters form a group $X(k)$, having basis $\chi_1, \dots, \chi_m$; since
$\operatorname{ord} \chi_i = \operatorname{ord} B_i$, the groups $X(k)$ and $M(k)$ are isomorphic. The
unit element of $X(k)$ is the character $\chi_0$, the principal character,
such that $\chi_0(a) = 1$ for every $a$ in $M(k)$.

We summarize the chief results obtained so far. 


Theorem 6-5. There are $h$ distinct characters (mod $k$), and these
form a group $X(k)$ which is isomorphic to $M(k)$. Every value $\chi(a)$
is an $h$th root of unity. The characters $\chi_1, \ldots, \chi_m$ defined in (10)
form a basis for $X(k)$.


We shall also need the following result. 
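Theorem 6-6. If $\chi$ is in $X(k)$, then

$$\sum_{a \in M(k)} \chi(a) = \begin{cases} h & \text{if } \chi = \chi_0, \\ 0 & \text{if } \chi \ne \chi_0, \end{cases}$$

while if $a$ is in $M(k)$, then

$$\sum_{\chi \in X(k)} \chi(a) = \begin{cases} h & \text{if } a \equiv 1 \pmod{k}, \\ 0 & \text{if } a \not\equiv 1 \pmod{k}. \end{cases}$$

Proof: We have

$$\sum_{a \in M(k)} \chi_0(a) = \sum_{a \in M(k)} 1 = h.$$

If $\chi \ne \chi_0$, then for some $b$ in $M(k)$, $\chi(b) \ne 1$. For this $b$,

$$\chi(b) \sum_a \chi(a) = \sum_a \chi(a)\chi(b) = \sum_a \chi(ab),$$

and, as $a$ runs over a reduced residue system, so does $ab$, so that

$$\chi(b) \sum_a \chi(a) = \sum_a \chi(a), \qquad \sum_a \chi(a) = 0.$$

If $a \equiv 1 \pmod{k}$, every character has the value 1 at $a$, and the second sum is $h$.
If $a \not\equiv 1 \pmod{k}$, and $a \equiv B_1^{x_1} \cdots B_m^{x_m}$ with $0 \le x_i < h_i$, then some $x_i \ne 0$. For
this $i$, $\chi_i(a) \ne 1$, and

$$\chi_i(a) \sum_\chi \chi(a) = \sum_\chi \chi_i(a)\chi(a) = \sum_\chi \chi'(a),$$

where $\chi' = \chi_i \chi$. As $\chi$ runs over $X(k)$, so also does $\chi_i \chi = \chi'$, and

$$\chi_i(a) \sum_\chi \chi(a) = \sum_\chi \chi(a), \qquad \sum_\chi \chi(a) = 0.$$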
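
The two relations of Theorem 6-6 can be checked numerically. The sketch below is an illustration only, and it assumes a prime modulus $p$, so that $M(p)$ is cyclic and a single basis element (a primitive root) suffices; the helper names `primitive_root`, `characters`, `char_sum` and `class_sum` are hypothetical and do not come from the text.

```python
import cmath

def primitive_root(p):
    """Smallest generator of the multiplicative group mod a prime p (brute force)."""
    for g in range(1, p):
        if len({pow(g, e, p) for e in range(1, p)}) == p - 1:
            return g
    raise ValueError("p must be prime")

def characters(p):
    """All h = p - 1 characters mod p, each as a dict a -> chi(a) for a in M(p).

    With basis element g of order h, the character with parameter beta
    sends g^e to exp(2*pi*i*beta*e/h), as in equation (9).
    """
    g, h = primitive_root(p), p - 1
    index = {pow(g, e, p): e for e in range(h)}          # a = g^e  ->  e
    return [
        {a: cmath.exp(2j * cmath.pi * beta * e / h) for a, e in index.items()}
        for beta in range(h)
    ]

def char_sum(chi):
    """Sum of chi(a) over a in M(p): first relation of Theorem 6-6."""
    return sum(chi.values())

def class_sum(chars, a):
    """Sum of chi(a) over all characters chi: second relation of Theorem 6-6."""
    return sum(chi[a] for chi in chars)

chars = characters(7)            # chars[0] is the principal character (beta = 0)
print(abs(char_sum(chars[0])))   # close to h = 6
print(abs(class_sum(chars, 3)))  # close to 0, since 3 is not congruent to 1 (mod 7)
```

Floating-point roots of unity are used, so the sums vanish only up to rounding error.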





$\chi(a)$ has so far been defined only for arguments relatively prime
to $k$. For simplicity in later formulas, we define

$$\chi(a) = 0 \quad \text{if } (a, k) > 1.$$


This does not affect the validity of Theorem 6-6. 

The duality of the relations of Theorem 6-6 is a reflection of the
isomorphism of $X(k)$ and $M(k)$. In a sense, the reason for the
importance of characters in the investigation of primes in progressions
lies in the second relation, since it singles out the elements of
a particular residue class (mod $k$), so that by use of the relation

$$\sum_{\substack{u < a < v \\ a \equiv 1 \ (\mathrm{mod}\ k)}} g(a) = \frac{1}{h} \sum_{u < a < v} g(a) \sum_{\chi} \chi(a),$$

sums can be extended over an entire interval instead of a finite or
infinite arithmetic progression $kt + 1$. Moreover, by a slight
modification, any other residue class can be distinguished in the same way.


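Theorem 6-7. If $(a, k) = (b, k) = 1$, then

$$\sum_{\chi \in X(k)} \frac{\chi(a)}{\chi(b)} = \begin{cases} h & \text{if } a \equiv b \pmod{k}, \\ 0 & \text{otherwise.} \end{cases}$$

Proof: Choose $c$ so that $bc \equiv 1 \pmod{k}$. Then $\chi(b)\chi(c) = 1$, so that

$$\sum_{\chi \in X(k)} \frac{\chi(a)}{\chi(b)} = \sum_{\chi \in X(k)} \chi(ac),$$

and, by Theorem 6-6, the last sum is $h$ or 0 according as $ac$ is or is
not congruent to 1 (mod $k$), that is, according as $a$ is or is not
congruent to $b$ (mod $k$).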
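
As an illustration of how Theorem 6-7 singles out a residue class, the sketch below builds the four characters modulo 8 from the basis elements 3 and 5 of $M(8)$ (each of order 2) and uses them to sum a function over one class only. The modulus 8, the choice of basis, and the names `decompose`, `chi`, `filtered_sum` are assumptions made for the example, not taken from the text.

```python
from itertools import product

K = 8
BASIS = [(3, 2), (5, 2)]     # (basis element B_j, its order h_j) for M(8), assumed basis

def decompose():
    """Map each a in M(8) to its exponent tuple (x_1, x_2) with a = 3^x_1 * 5^x_2 (mod 8)."""
    table = {}
    for xs in product(*(range(h) for _, h in BASIS)):
        a = 1
        for (b, _), x in zip(BASIS, xs):
            a = a * pow(b, x, K) % K
        table[a] = xs
    return table

EXPONENTS = decompose()
ALL_BETAS = list(product(range(2), repeat=2))   # one tuple (beta_1, beta_2) per character
H = len(ALL_BETAS)                              # h = phi(8) = 4

def chi(betas, n):
    """Character with parameters betas, extended by chi(n) = 0 when (n, 8) > 1."""
    xs = EXPONENTS.get(n % K)
    if xs is None:
        return 0
    # every h_j is 2 here, so exp(2*pi*i*beta*x/2) is simply (-1)^(beta*x)
    return (-1) ** sum(b * x for b, x in zip(betas, xs))

def filtered_sum(g, lo, hi, b):
    """Sum of g(a) over lo < a <= hi with a = b (mod 8), for integer-valued g and (b, 8) = 1."""
    total = 0
    for a in range(lo + 1, hi + 1):
        # these characters take values +-1, so 1/chi(b) equals chi(b)
        weight = sum(chi(bt, a) * chi(bt, b) for bt in ALL_BETAS)   # h or 0 by Theorem 6-7
        total += g(a) * weight
    return total // H

print(filtered_sum(lambda a: a, 0, 40, 3))   # sums 3, 11, 19, 27, 35
```

The inner character sum is the "indicator" of the class of $b$; dividing by $h$ at the end recovers the restricted sum.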


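It should be noticed that the function

$$\chi(n) = \begin{cases} (-1)^{(n-1)/2} & \text{for } n \text{ odd}, \\ 0 & \text{for } n \text{ even}, \end{cases}$$

introduced in Section 6-1 is a character modulo 4. It and the principal
character

$$\chi_0(n) = \begin{cases} 1 & \text{for } n \text{ odd}, \\ 0 & \text{for } n \text{ even}, \end{cases}$$

constitute the group $X(4)$ of order $\varphi(4) = 2$. The correspondence

$$\chi_0 \leftrightarrow 1, \qquad \chi \leftrightarrow 3,$$

describes the isomorphism between $X(4)$ and $M(4)$; each is the cyclic
group of two elements.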





6-3 The L-functions. For each character $\chi$, we define a function
$L(s, \chi)$ for $s > 1$ by the equation

$$L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s},$$

or equivalently (according to Theorem 6-3) by the equation

$$L(s, \chi) = \prod_p \left(1 - \frac{\chi(p)}{p^s}\right)^{-1}. \tag{11}$$

In particular,

$$L(s, \chi_0) = \prod_{p \nmid k} (1 - p^{-s})^{-1} = \prod_{p \mid k} (1 - p^{-s})\, \zeta(s), \tag{12}$$

so that, by equation (2), $L(s, \chi_0)$ is the product of $\prod_{p \mid k} (1 - p^{-s})$ and the
representation of $\zeta(s)$ obtained there.
This latter representation for $L(s, \chi_0)$ is consistent with the series
definition for $s > 1$, and may be taken as the definition for $0 < s \le 1$.

For the proof of Dirichlet's theorem, it is necessary to know some
of the properties of these L-functions. All the relevant properties
can be proved by elementary arguments, but the proofs frequently
can be simplified considerably if use is made of the theory of functions
of a complex variable. In these cases alternative proofs will
be given.

Theorem 6-8. $L(s, \chi_0)$ is continuous for $s > 1$, and

$$\lim_{s \to 1^+} (s - 1) L(s, \chi_0) = \frac{h}{k}.$$

Proof: For $s \ge s_0 > 1$,

$$L(s, \chi_0) = \sum_{n=1}^{\infty} \frac{\chi_0(n)}{n^s} \ll \sum_{n=1}^{\infty} \frac{1}{n^{s_0}},$$

so that the series for $L(s, \chi_0)$ converges uniformly in any interval to
the right of $s = 1$. Since the separate terms are continuous, the sum
is also continuous. Moreover, by (12),

$$\lim_{s \to 1^+} (s - 1) L(s, \chi_0) = \prod_{p \mid k} (1 - p^{-1}) = \frac{h}{k}.$$

For $\chi \ne \chi_0$, Theorem 6-6 shows that for arbitrary $n_0$,

$$\sum_{n=n_0}^{n_0+k-1} \chi(n) = 0,$$

so that by grouping the terms of

$$\sum_{n=u}^{v} \chi(n)$$

in blocks of $k$, with perhaps part of a block left over, we see that

$$\left| \sum_{n=u}^{v} \chi(n) \right| \le h.$$

It follows from Theorem 6-2 that the Dirichlet series for $L(s, \chi)$ is
convergent for $s > 0$. We need a slightly stronger result, which is
proved in the following theorem.


Theorem 6-9. If $\chi \ne \chi_0$, then $L(s, \chi)$ has a continuous derivative
(and is therefore itself continuous) for $s > 0$.

Elementary proof: We use the standard theorem from analysis,
that if the series resulting from termwise differentiation of a given
series converges uniformly over an interval, then its sum is the
derivative of the original series. The termwise derivative of

$$\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}$$

is

$$-\sum_{n=1}^{\infty} \frac{\chi(n) \log n}{n^s}, \tag{13}$$

and for $0 < s_0 \le s \le s_1$, the result follows from Theorem 6-2 by
taking $a_n = \chi(n)$, $b_n = (\log n)/n^s$. But $s_0$ may be arbitrarily
small, and $s_1$ arbitrarily large, so that every $s > 0$ can be included in
an interval in which $L(s, \chi)$ is continuously differentiable.


Alternative proof: Applying Theorem 6-2 and the fact that 




z 

n*l 



« Z 

n 



where $\sigma = \operatorname{Re} s$, we see that the series for $L(s, \chi)$ is uniformly
convergent for $\operatorname{Re} s \ge \sigma_0 > 0$. Since each term of the series is an
analytic function of $s$, the sum is also analytic, and is therefore
differentiable.
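
The boundedness of the character sums and the resulting convergence of the Dirichlet series for $s > 0$ can be observed numerically. The following sketch uses the nonprincipal character modulo 4; at $s = 1$ its series is the Leibniz series $1 - \tfrac13 + \tfrac15 - \cdots = \pi/4$. The function names `chi4`, `L_partial` and `max_partial_char_sum` are hypothetical.

```python
import math

def chi4(n):
    """Nonprincipal character mod 4: values 0, 1, 0, -1 for n = 0, 1, 2, 3 (mod 4)."""
    return (0, 1, 0, -1)[n % 4]

def L_partial(s, N):
    """Partial sum over n <= N of the Dirichlet series chi4(n) / n^s."""
    return sum(chi4(n) / n ** s for n in range(1, N + 1))

def max_partial_char_sum(N):
    """Largest |chi4(1) + ... + chi4(n)| over n <= N; the text bounds this by h = 2."""
    running, worst = 0, 0
    for n in range(1, N + 1):
        running += chi4(n)
        worst = max(worst, abs(running))
    return worst

print(max_partial_char_sum(1000))                 # stays within the bound h = 2
print(L_partial(1, 20000), math.pi / 4)           # the two values nearly agree
print(L_partial(0.5, 20001))                      # the series still converges at s = 1/2
```

Convergence for $0 < s < 1$ is slow (the terms decay like $n^{-s}$), so the last value is only a rough approximation.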


6-4 Nonelementary proof of Dirichlet's theorem. There is a proof
of Dirichlet's theorem which is remarkably simple and illuminating,
and which fails to be elementary only in the sense that logarithms of
complex numbers are used. If the student who is not familiar with
this extension will assume that the usual properties of logarithms of
positive numbers (including the form of the Maclaurin expansion of
$\log(1 + x)$ for $|x| < 1$) carry over to logarithms of nonzero complex
numbers, he will find this proof much more straightforward than the
elementary proof given in the following section, where use is made of
the relation

$$\frac{d}{dx} \log f(x) = \frac{f'(x)}{f(x)}$$

to avoid logarithms entirely.

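For $s > 1$, $|\chi(p)/p^s| < 1$, so that for such $s$ we can describe a
branch of the function $\log(1 - \chi(p)/p^s)$ by the equation

$$\log\left(1 - \frac{\chi(p)}{p^s}\right) = -\sum_{m=1}^{\infty} \frac{1}{m} \left(\frac{\chi(p)}{p^s}\right)^m.$$

By (11), this induces the choice

$$\log L(s, \chi) = -\sum_p \log\left(1 - \frac{\chi(p)}{p^s}\right) = \sum_p \sum_{m=1}^{\infty} \frac{\chi(p^m)}{m p^{ms}} \quad \text{for } s > 1. \tag{14}$$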

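Theorem 6-10. For each $\chi$, the function

$$F(s, \chi) = \log L(s, \chi) - \sum_p \frac{\chi(p)}{p^s}$$

is bounded in absolute value for $s > 1$.

Proof: We rewrite (14) in the form

$$\log L(s, \chi) = \sum_p \frac{\chi(p)}{p^s} + \sum_p \sum_{m=2}^{\infty} \frac{\chi(p^m)}{m p^{ms}}. \tag{15}$$

Here,

$$\left| \sum_p \sum_{m=2}^{\infty} \frac{\chi(p^m)}{m p^{ms}} \right| \le \frac12 \sum_p \sum_{m=2}^{\infty} \frac{1}{p^{ms}} = \frac12 \sum_p \frac{1}{p^{2s}(1 - p^{-s})} \le \frac{(1 - 2^{-s})^{-1}}{2} \sum_p \frac{1}{p^{2s}} \le \frac{(1 - 2^{-s})^{-1}}{2}\, \zeta(2s),$$

and since $\zeta(2s)$ is bounded for $2s \ge 1 + \epsilon$, the theorem follows.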



We can now complete the proof of Dirichlet's theorem, except for
one gap which will be considered later.

Theorem 6-11 (Dirichlet's theorem). If $(k, l) = 1$, then there are
infinitely many primes of the form $kt + l$.

Proof: Multiply equation (15) by $1/\chi(l)$ and sum over all $\chi$ in
$X(k)$. This gives

$$\sum_\chi \frac{\log L(s, \chi)}{\chi(l)} = \sum_\chi \frac{1}{\chi(l)} \sum_p \frac{\chi(p)}{p^s} + \sum_\chi \frac{F(s, \chi)}{\chi(l)} = \sum_p \frac{1}{p^s} \sum_\chi \frac{\chi(p)}{\chi(l)} + \sum_\chi \frac{F(s, \chi)}{\chi(l)},$$

and, by Theorem 6-7,

$$\sum_\chi \frac{\log L(s, \chi)}{\chi(l)} = h \sum_{p \equiv l \ (\mathrm{mod}\ k)} \frac{1}{p^s} + \sum_\chi \frac{F(s, \chi)}{\chi(l)}. \tag{16}$$

Let $s \to 1^+$ in (16). The second term on the right remains bounded,
by Theorem 6-10. We know that

$$\lim_{s \to 1^+} L(s, \chi_0) = \infty,$$

so that

$$\lim_{s \to 1^+} \frac{\log L(s, \chi_0)}{\chi_0(l)} = \infty.$$

Suppose for the moment that it had been shown that the remaining
functions $L(s, \chi)$ (which we know to be continuous at $s = 1$) have
nonzero values $L(1, \chi)$ at $s = 1$. It would follow that

$$\lim_{s \to 1^+} \left| \sum_{\chi \ne \chi_0} \frac{\log L(s, \chi)}{\chi(l)} \right| < \infty,$$

and (16) would then imply that

$$\lim_{s \to 1^+} \sum_{p \equiv l \ (\mathrm{mod}\ k)} \frac{1}{p^s} = \infty,$$

an equation which is possible only if the sum has infinitely many
terms. Thus when we show that $L(1, \chi) \ne 0$ if $\chi \ne \chi_0$, we shall
have proved not only Dirichlet's theorem but the stronger result that
the series

$$\sum_{p \equiv l \ (\mathrm{mod}\ k)} \frac{1}{p}$$

diverges.





6-5 Elementary proof of Dirichlet's theorem. It is possible to
avoid the complex logarithm $\log L(s, \chi)$ by using its derivative
instead:

$$\frac{d}{ds} \log L(s, \chi) = \frac{L'(s, \chi)}{L(s, \chi)} = \frac{L'}{L}(s, \chi).$$

If we could use the relation (14), we could immediately deduce that

$$\frac{L'}{L}(s, \chi) = -\sum_p \sum_{m=1}^{\infty} \frac{\chi(p^m) \log p}{p^{ms}};$$

since we cannot, we arrive at the same result by the rather more
awkward method of dividing $L'(s, \chi)$ by $L(s, \chi)$. In the process,
we shall have occasion to use some properties of the Möbius $\mu$-function,
which is defined by the following relations:



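$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \text{ is divisible by a square larger than 1}, \\ (-1)^r & \text{if } n \text{ is the product of } r \text{ distinct primes.} \end{cases}$$

Alternatively, $\mu$ is the multiplicative function (that is, $\mu(mn) = \mu(m)\mu(n)$
whenever $(m, n) = 1$) such that

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ -1 & \text{if } n = p, \text{ a prime}, \\ 0 & \text{if } n = p^a,\ a > 1. \end{cases}$$

The properties we shall need are these.*

(a) $$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases}$$

(b) If $f$ is any number-theoretic function and

$$F(n) = \sum_{d \mid n} f(d),$$

then

$$f(n) = \sum_{d \mid n} \mu(d)\, F\!\left(\frac{n}{d}\right).$$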
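
Properties (a) and (b) are easy to test on small integers. The sketch below is an illustration under the obvious assumptions: `mobius` is computed by trial division, and the names `divisors`, `divisor_sum` and `mobius_invert` are hypothetical helpers, not notation from the text.

```python
def mobius(n):
    """Mobius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:        # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                     # one prime factor is left over
        result = -result
    return result

def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def divisor_sum(f, n):
    """F(n) = sum of f(d) over the divisors d of n."""
    return sum(f(d) for d in divisors(n))

def mobius_invert(F, n):
    """Recover f(n) from F via property (b): sum of mu(d) * F(n/d) over d | n."""
    return sum(mobius(d) * F(n // d) for d in divisors(n))

f = lambda n: n * n + 1                  # an arbitrary number-theoretic function
F = lambda n: divisor_sum(f, n)
print([mobius(n) for n in range(1, 11)])
print(mobius_invert(F, 12), f(12))       # the two values agree
```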


* See, for example, Volume I, Theorems 6-5 and 6-6.

Theorem 6-12. If $f$ is a completely multiplicative function, and the
series

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$$

converges absolutely for $s > s_0$, then

$$\left( \sum_{n=1}^{\infty} \frac{f(n)}{n^s} \right)^{-1} = \sum_{n=1}^{\infty} \frac{f(n)\mu(n)}{n^s}$$

for $s > s_0$.

Proof: We have

$$\sum_{m=1}^{\infty} \frac{f(m)}{m^s} \sum_{n=1}^{\infty} \frac{f(n)\mu(n)}{n^s} = \sum_{m,n=1}^{\infty} \frac{f(mn)\mu(n)}{(mn)^s} = \sum_{j=1}^{\infty} \frac{f(j)}{j^s} \sum_{d \mid j} \mu(d) = 1.$$

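Theorem 6-13. For each $\chi$, the relation

$$\frac{L'}{L}(s, \chi) = -\sum_{n=1}^{\infty} \frac{\chi(n)\Lambda(n)}{n^s} \tag{17}$$

holds for $s > 1$, where

$$\Lambda(n) = \begin{cases} \log p & \text{if } n = p^a \text{ for some } a > 0 \text{ and prime } p, \\ 0 & \text{otherwise.} \end{cases}$$

Proof: By the preceding theorem and the expression (13) for
$L'(s, \chi)$, we have, for $s > 1$,

$$\frac{L'}{L}(s, \chi) = -\sum_{m=1}^{\infty} \frac{\chi(m) \log m}{m^s} \sum_{j=1}^{\infty} \frac{\chi(j)\mu(j)}{j^s} = -\sum_{m,j=1}^{\infty} \frac{\chi(mj)\mu(j) \log m}{(mj)^s} = -\sum_{n=1}^{\infty} \frac{\chi(n) \sum_{d \mid n} \mu(d) \log (n/d)}{n^s}.$$

But from the obvious relation

$$\log n = \sum_{d \mid n} \Lambda(d)$$

and the Möbius inversion formula quoted above, we have

$$\Lambda(n) = \sum_{d \mid n} \mu(d) \log \frac{n}{d},$$

and the theorem follows.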
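
The two identities used at the end of this proof, $\log n = \sum_{d \mid n} \Lambda(d)$ and its Möbius inverse, can be confirmed numerically. This is a minimal sketch; the function names `von_mangoldt` and `mobius` are hypothetical, and floating-point logarithms mean the identities hold only up to rounding.

```python
import math

def von_mangoldt(n):
    """Lambda(n): log p if n = p^a with a > 0 and p prime, otherwise 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            # n was a prime power exactly when nothing is left after removing p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)            # no divisor up to sqrt(n): n itself is prime

def mobius(n):
    """Mobius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def log_via_lambda(n):
    """Right side of log n = sum of Lambda(d) over d | n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

def lambda_via_mobius(n):
    """Right side of Lambda(n) = sum of mu(d) * log(n/d) over d | n."""
    return sum(mobius(d) * math.log(n // d) for d in range(1, n + 1) if n % d == 0)

print(log_via_lambda(360), math.log(360))
print(lambda_via_mobius(27), math.log(3))
```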





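Theorem 6-14. For each $\chi$, the function

$$G(s, \chi) = \frac{L'}{L}(s, \chi) + \sum_p \frac{\chi(p) \log p}{p^s} \tag{18}$$

is bounded in absolute value for $s > 1$.

Proof: Equation (17) may be rewritten in the form

$$\frac{L'}{L}(s, \chi) = -\sum_p \frac{\chi(p) \log p}{p^s} - \sum_p \sum_{m=2}^{\infty} \frac{\chi(p^m) \log p}{p^{ms}},$$

and

$$\left| \sum_p \sum_{m=2}^{\infty} \frac{\chi(p^m) \log p}{p^{ms}} \right| \le \sum_p \log p \sum_{m=2}^{\infty} \frac{1}{p^{ms}} = \sum_p \frac{\log p}{p^{2s}(1 - p^{-s})} \le \sum_p \frac{\log p}{p^{2s}(1 - 2^{-s})},$$

and the last series clearly converges for $s > \tfrac12$.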

We can now complete the proof of Dirichlet's theorem in much
the same way as before. Multiplying both sides of (18) by $1/\chi(l)$,
and summing over all $\chi$, we obtain

$$\sum_\chi \frac{1}{\chi(l)} \frac{L'}{L}(s, \chi) = -\sum_p \frac{\log p}{p^s} \sum_\chi \frac{\chi(p)}{\chi(l)} + \sum_\chi \frac{G(s, \chi)}{\chi(l)} = -h \sum_{p \equiv l \ (\mathrm{mod}\ k)} \frac{\log p}{p^s} + \sum_\chi \frac{G(s, \chi)}{\chi(l)}.$$

Now let $s \to 1^+$. The second term on the right remains bounded.
Assuming again that $L(1, \chi) \ne 0$ for $\chi \ne \chi_0$, the quantity $1/L(s, \chi)$
is also bounded for $s$ sufficiently close to 1, since $L$ is continuous at
$s = 1$. For $\chi \ne \chi_0$, $L'(s, \chi)$ remains bounded, by Theorem 6-9. On
the other hand,




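$$-\frac{L'}{L}(s, \chi_0) = \sum_{(n,k)=1} \frac{\Lambda(n)}{n^s} \ge \sum_{p \nmid k} \frac{\log p}{p^s} \ge (\log 2) \sum_{p \nmid k} \frac{1}{p^s} = (\log 2)\bigl(\log L(s, \chi_0) - F(s, \chi_0)\bigr),$$

and, since $F(s, \chi_0)$ is bounded by Theorem 6-10, the quantity $\log L(s, \chi_0) - F(s, \chi_0)$ increases
without bound as $s \to 1^+$. It follows that

$$\lim_{s \to 1^+} \sum_{p \equiv l \ (\mathrm{mod}\ k)} \frac{\log p}{p^s} = \infty,$$

and the theorem is again proved, except for the verification of the
fact that $L(1, \chi) \ne 0$ for $\chi \ne \chi_0$.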

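6-6 Proof that $L(1, \chi) \ne 0$.

Theorem 6-15. If $\chi$ assumes a nonreal value for some $n$, then
$L(1, \chi) \ne 0$.

Proof: Let $\chi$ be such a character, and let $\bar\chi$ be the function whose
value for each $a$ is the complex conjugate of that of $\chi$. Clearly $\bar\chi$ is
also a character, and $\bar\chi \ne \chi$. But if $L(1, \chi) = 0$, then also

$$L(1, \bar\chi) = \overline{L(1, \chi)} = 0,$$

so at least two L-functions must vanish in this case. Since $L(s, \chi)$ is
differentiable at $s = 1$, the quantities

$$L'(1, \chi) = \lim_{s \to 1} \frac{L(s, \chi)}{s - 1} \quad \text{and} \quad L'(1, \bar\chi) = \lim_{s \to 1} \frac{L(s, \bar\chi)}{s - 1}$$

exist, so that there is a number $A$ such that

$$\lim_{s \to 1^+} \frac{\prod_{\chi \ne \chi_0} L(s, \chi)}{(s - 1)^2} = A.$$

Since

$$\lim_{s \to 1^+} (s - 1) L(s, \chi_0) = \frac{h}{k},$$

we deduce that

$$\lim_{s \to 1^+} \prod_\chi L(s, \chi) = \lim_{s \to 1^+} (s - 1) \cdot \bigl((s - 1) L(s, \chi_0)\bigr) \cdot \frac{\prod_{\chi \ne \chi_0} L(s, \chi)}{(s - 1)^2} = 0 \cdot \frac{h}{k} \cdot A = 0.$$

But by (14),

$$\sum_\chi \log L(s, \chi) = \sum_\chi \sum_p \sum_{m=1}^{\infty} \frac{\chi(p^m)}{m p^{ms}} = \sum_p \sum_{m=1}^{\infty} \frac{1}{m p^{ms}} \sum_\chi \chi(p^m) = h \sum_{\substack{p,\, m \\ p^m \equiv 1 \ (\mathrm{mod}\ k)}} \frac{1}{m p^{ms}} > 0$$

for $s > 1$, so that

$$\lim_{s \to 1^+} \prod_\chi L(s, \chi) \ge e^0 = 1.$$

This contradiction establishes the theorem.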

It would not be easy to avoid the use of the complex logarithm in
this proof, since the Dirichlet series for $\prod_\chi L(s, \chi)$ has very complicated
coefficients. To obtain an elementary proof, it is simpler to use a
different combination of L-functions. Unfortunately, the choice we
make can hardly be motivated by an elementary argument, but must
remain a deus ex machina until Section 7-3. It is the left side of the
inequality

$$L^3(s, \chi_0)\, |L(s, \chi)|^4\, |L(s, \chi^2)|^2 \ge 1; \tag{19}$$

this inequality we now show to be valid for $s > 1$.

Note first that for $z = r(\cos\theta + i \sin\theta)$,

$$|1 - z|^2 = |1 - r\cos\theta - i r \sin\theta|^2 = 1 - 2r\cos\theta + r^2,$$

and that for arbitrary real $\theta$,

$$2\cos\theta + \cos 2\theta = 2\cos\theta + 2\cos^2\theta - 1 = 2\left(\cos\theta + \tfrac12\right)^2 - \tfrac32 \ge -\tfrac32.$$

Using the fact that the geometric mean of three positive numbers is
at most equal to their arithmetic mean, we see that, if $p \nmid k$ and

$$\chi(p) = \cos\theta_p + i \sin\theta_p,$$

then

$$\left|1 - \frac{\chi(p)}{p^s}\right|^4 \left|1 - \frac{\chi^2(p)}{p^s}\right|^2 = (1 - 2p^{-s}\cos\theta_p + p^{-2s})^2 (1 - 2p^{-s}\cos 2\theta_p + p^{-2s})$$
$$\le \left(1 - \tfrac23 p^{-s}(2\cos\theta_p + \cos 2\theta_p) + p^{-2s}\right)^3 \le (1 + p^{-s} + p^{-2s})^3 < \left(\frac{1}{1 - p^{-s}}\right)^3,$$

or

$$(1 - \chi_0(p)p^{-s})^3\, |1 - \chi(p)p^{-s}|^4\, |1 - \chi^2(p)p^{-s}|^2 \le 1.$$

This inequality also holds if $p \mid k$ and, multiplying over all $p$, we obtain
(19).

It is now simple to prove that $L(1, \chi) \ne 0$ if $\chi^2 \ne \chi_0$, that is, if $\chi$ is
nonreal. Supposing the opposite, and using the fact that $L'(s, \chi)$ is
continuous at $s = 1$, we have that for $1 < s < s_1$,

$$|L(s, \chi)| = |L(s, \chi) - L(1, \chi)| = \left| \int_1^s L'(u, \chi)\, du \right| \le M_\chi (s - 1),$$

where

$$M_\chi = \max_{1 \le s \le s_1} |L'(s, \chi)|.$$

But now (19) can be recast in the form

$$(s - 1) \cdot \bigl((s - 1) L(s, \chi_0)\bigr)^3 \left| \frac{L(s, \chi)}{s - 1} \right|^4 |L(s, \chi^2)|^2 \ge 1,$$

in which the first factor tends to zero and the others remain bounded,
as $s \to 1^+$. This inequality is false for some $s > 1$, and the
contradiction shows that $L(1, \chi) \ne 0$.

No device of this sort has been found for the case that $\chi(n)$ is real
for all $n$. Showing that $L(1, \chi) \ne 0$ for a real character is the most
difficult point in the entire proof. Dirichlet effected it by showing
that $L(1, \chi)$ is a factor in the class number of a certain quadratic
field. This and other algebraic proofs require a considerable amount
of background; we shall content ourselves with an elementary and a
function-theoretic proof. We first sketch the idea.

If $s > 1$, then

$$\zeta(s) L(s, \chi) = \sum_{m=1}^{\infty} \frac{1}{m^s} \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \sum_{m,n=1}^{\infty} \frac{\chi(n)}{(mn)^s} = \sum_{l=1}^{\infty} \frac{1}{l^s} \sum_{d \mid l} \chi(d),$$

so that if we put

$$f(n) = \sum_{d \mid n} \chi(d), \tag{20}$$

then

$$\zeta(s) L(s, \chi) = \sum_{n=1}^{\infty} \frac{f(n)}{n^s} \tag{21}$$

for $s > 1$.

By Theorem 6-16, below,

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s} \ge \sum_{m=1}^{\infty} \frac{1}{(m^2)^s} = \zeta(2s),$$

so that even if the series $\sum f(n) n^{-s}$ converges to the right of $s = \tfrac12$,
it is certainly not bounded near $s = \tfrac12$. In the analytic proof, we show
that (21) is correct for $s > \tfrac12$ if $L(1, \chi) = 0$, and obtain the
contradiction

$$\lim_{s \to \frac12^+} L(s, \chi)\zeta(s) = L(\tfrac12, \chi)\, \zeta(\tfrac12) = \infty.$$

In the elementary proof, questions of convergence are avoided by
considering partial sums for $s = \tfrac12$ rather than the full series for $s$
near $\tfrac12$. It will be shown that

$$\sum_{n \le x} \frac{f(n)}{\sqrt{n}} = 2\sqrt{x}\, L(1, \chi) + O(1),$$

and also that the sum on the left tends to infinity with $x$, so that the
relation $L(1, \chi) = 0$ is impossible.


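Theorem 6-16. With $f$ as in (20),

$$f(n) \ge \begin{cases} 0 & \text{for all } n, \\ 1 & \text{for square } n. \end{cases}$$

Proof: Being the arithmetic sum function of a multiplicative
function, $f$ is itself multiplicative,* so that

$$f(p_1^{a_1} \cdots p_r^{a_r}) = f(p_1^{a_1}) \cdots f(p_r^{a_r}).$$

Since $\chi$ is a real character, $\chi(p) = 0$ or $\pm 1$ for each prime $p$, and

$$f(p^a) = \sum_{\beta=0}^{a} \chi(p^\beta) = \sum_{\beta=0}^{a} (\chi(p))^\beta = \begin{cases} 1 + 0 + \cdots + 0 & \text{if } \chi(p) = 0, \\ 1 + 1 + \cdots + 1 & \text{if } \chi(p) = 1, \\ 1 - 1 + \cdots + (-1)^a & \text{if } \chi(p) = -1. \end{cases}$$

Hence

$$f(p^a) = \begin{cases} 1 & \text{if } \chi(p) = 0, \\ a + 1 & \text{if } \chi(p) = 1, \\ 1 & \text{if } \chi(p) = -1,\ a \text{ even}, \\ 0 & \text{if } \chi(p) = -1,\ a \text{ odd}, \end{cases}$$

and the theorem follows.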
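
Theorem 6-16 can be spot-checked for a concrete real character. The sketch below takes the nonprincipal character modulo 4 as an assumed example and evaluates $f(n) = \sum_{d \mid n} \chi(d)$ directly; the names `chi4` and `f` are illustrative only.

```python
def chi4(n):
    """Nonprincipal (real) character mod 4: values 0, 1, 0, -1 for n = 0, 1, 2, 3 (mod 4)."""
    return (0, 1, 0, -1)[n % 4]

def f(n):
    """f(n) = sum of chi4(d) over the divisors d of n, as in equation (20)."""
    return sum(chi4(d) for d in range(1, n + 1) if n % d == 0)

# f(p^a) for p = 5 (chi = 1) gives a + 1; for p = 3 (chi = -1) it alternates 1, 0, 1, ...
print([f(5 ** a) for a in range(4)])
print([f(3 ** a) for a in range(4)])
print(min(f(n) for n in range(1, 500)))          # never negative
print(min(f(m * m) for m in range(1, 30)))       # at least 1 on squares
```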


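Theorem 6-17. The relations

$$\sum_{n \le x} \chi(n) = O(1) \tag{22}$$

and

$$\sum_{n \ge x} \frac{\chi(n)}{n^s} = O(x^{-s}) \tag{23}$$

hold as $x \to \infty$, for $s > 0$.

* Cf. Volume I, Theorem 6-3.

Proof: We have already noticed that if

$$S(x) = \sum_{n \le x} \chi(n),$$

then $|S(x)| \le h$, which implies (22). Using this, we have, for integral $x$,

$$\left| \sum_{n \ge x} \frac{\chi(n)}{n^s} \right| = \left| \sum_{n \ge x} \frac{S(n) - S(n - 1)}{n^s} \right| = \left| \sum_{n \ge x} S(n) \left( \frac{1}{n^s} - \frac{1}{(n + 1)^s} \right) - \frac{S(x - 1)}{x^s} \right|$$
$$\le h \sum_{n \ge x} \left( \frac{1}{n^s} - \frac{1}{(n + 1)^s} \right) + \frac{h}{x^s} = \frac{2h}{x^s},$$

which implies (23).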

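Theorem 6-18. There is a constant $C$ such that

$$\sum_{n \le x} \frac{1}{\sqrt{n}} = 2\sqrt{x} + C + O\!\left(\frac{1}{\sqrt{x}}\right).$$

Proof: Put

$$t_n = 2\sqrt{n} - 2\sqrt{n - 1} - \frac{1}{\sqrt{n}} = \int_{n-1}^{n} \frac{dx}{\sqrt{x}} - \frac{1}{\sqrt{n}},$$

so that

$$\sum_{n=2}^{x} \frac{1}{\sqrt{n}} = 2\sqrt{x} - 2 - \sum_{n=2}^{x} t_n.$$

Now $t_n$, being the area of the triangular region bounded by the curve
$y = x^{-1/2}$ and the lines $x = n - 1$ and $y = n^{-1/2}$, is positive and smaller
than $(n - 1)^{-1/2} - n^{-1/2}$, so that the series

$$\sum_{n=2}^{\infty} t_n$$

converges, and

$$\sum_{n=x+1}^{\infty} t_n < \sum_{n=x+1}^{\infty} \left( \frac{1}{\sqrt{n - 1}} - \frac{1}{\sqrt{n}} \right) = \frac{1}{\sqrt{x}}.$$

Hence

$$\sum_{n=1}^{x} \frac{1}{\sqrt{n}} = 2\sqrt{x} - 1 - \sum_{n=2}^{\infty} t_n + \sum_{n=x+1}^{\infty} t_n = 2\sqrt{x} - 1 - \sum_{n=2}^{\infty} t_n + O\!\left(\frac{1}{\sqrt{x}}\right).$$

This proves the theorem for integral $x$; its extension to real $x$ is
immediate.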
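
The statement of Theorem 6-18 can be observed numerically: the difference $\sum_{n \le x} n^{-1/2} - 2\sqrt{x}$ settles down as $x$ grows, and the proof bounds its distance from $C$ by $1/\sqrt{x}$. The sketch below is illustrative only; the function name `excess` is hypothetical, and the numerical value of $C$ it suggests (about $-1.46$) is an observation, not something stated in the text.

```python
import math

def excess(x):
    """sum_{n <= x} 1/sqrt(n) - 2*sqrt(x) for a positive integer x."""
    return sum(1 / math.sqrt(n) for n in range(1, x + 1)) - 2 * math.sqrt(x)

for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
    # the last column shows the change from x to 10x, which shrinks like 1/sqrt(x)
    print(x, excess(x), excess(10 * x) - excess(x))
```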





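Theorem 6-19. If $\chi \ne \chi_0$ is a real character, then $L(1, \chi) \ne 0$.

Proof: Put

$$G(x) = \sum_{n \le x} \frac{f(n)}{\sqrt{n}}.$$

By Theorem 6-16,

$$G(x) \ge \sum_{m^2 \le x} \frac{1}{m} = \sum_{m=1}^{[\sqrt{x}]} \frac{1}{m},$$

so that $G(x) \to \infty$ with $x$.
On the other hand,

$$G(x) = \sum_{j \le x} \frac{1}{\sqrt{j}} \sum_{d \mid j} \chi(d) = \sum_{uv \le x} \frac{\chi(v)}{\sqrt{uv}}.$$

Figure 6-1

This sum, extended over the lattice points $u, v$ for which $u \ge 1$,
$v \ge 1$, $uv \le x$, we split into two parts, as indicated in Fig. 6-1:

$$G(x) = \sum_{u < \sqrt{x}} \ \sum_{\sqrt{x} < v \le x/u} \frac{\chi(v)}{\sqrt{uv}} + \sum_{v \le \sqrt{x}} \ \sum_{u \le x/v} \frac{\chi(v)}{\sqrt{uv}}$$
$$= \sum_{u < \sqrt{x}} \frac{1}{\sqrt{u}} \sum_{\sqrt{x} < v \le x/u} \frac{\chi(v)}{\sqrt{v}} + \sum_{v \le \sqrt{x}} \frac{\chi(v)}{\sqrt{v}} \sum_{u \le x/v} \frac{1}{\sqrt{u}}$$
$$= \sum_{u < \sqrt{x}} \frac{1}{\sqrt{u}}\, O(x^{-1/4}) + \sum_{v \le \sqrt{x}} \frac{\chi(v)}{\sqrt{v}} \left( 2\sqrt{\frac{x}{v}} + C + O\!\left(\sqrt{\frac{v}{x}}\right) \right)$$
$$= O(x^{-1/4}) \cdot O(x^{1/4}) + 2\sqrt{x} \sum_{v \le \sqrt{x}} \frac{\chi(v)}{v} + C \cdot O(1) + O\!\left( x^{-1/2} \sum_{v \le \sqrt{x}} 1 \right)$$
$$= 2\sqrt{x} \left( L(1, \chi) + O(x^{-1/2}) \right) + O(1),$$

by Theorems 6-17 and 6-18, so that

$$G(x) = 2\sqrt{x}\, L(1, \chi) + O(1). \tag{24}$$

Thus, if $L(1, \chi)$ were zero, $G(x)$ would remain bounded as $x \to \infty$,
which is not the case.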
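
The relation $G(x) = 2\sqrt{x}\,L(1, \chi) + O(1)$ can be illustrated for the nonprincipal character modulo 4, for which $L(1, \chi) = \pi/4$ (the Leibniz series). The sketch below sums over the lattice points $uv \le x$ exactly as in the proof; the names `chi4` and `G` are hypothetical, and the choice of character is an assumption made for the example.

```python
import math

def chi4(n):
    """Nonprincipal (real) character mod 4: values 0, 1, 0, -1 for n = 0, 1, 2, 3 (mod 4)."""
    return (0, 1, 0, -1)[n % 4]

def G(x):
    """G(x) = sum over lattice points u, v >= 1 with u*v <= x of chi4(v) / sqrt(u*v)."""
    total = 0.0
    for v in range(1, x + 1):
        c = chi4(v)
        if c:
            total += c * sum(1 / math.sqrt(u * v) for u in range(1, x // v + 1))
    return total

for x in (100, 1000, 10000):
    main_term = 2 * math.sqrt(x) * math.pi / 4
    print(x, G(x), main_term, G(x) - main_term)   # the last column stays bounded
```

Because $L(1, \chi) \ne 0$ here, $G(x)$ grows like a constant times $\sqrt{x}$, while the difference from the main term remains of bounded size.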

A rather more straightforward (function-theoretic) proof can be
obtained by extending (21), which we know to be valid for $s > 1$, to
the range $s > \tfrac12$ under the assumption that $L(1, \chi) = 0$. By an
argument quite similar to, but slightly simpler than that which yielded
(24), it can be shown that if $L(1, \chi) = 0$, then

$$\sum_{n \le x} f(n) = O(\sqrt{x}).$$

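Theorem 6-2 implies that the series

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$$

converges for $s > \tfrac12$. Now let $\sigma_0$ be a real number greater than $\tfrac12$,
and let $s$ be a complex number with $\operatorname{Re} s = \sigma > \sigma_0$. Then for
$v > u > 1$, we have, by partial summation,

$$\sum_{n=u}^{v} \frac{f(n)}{n^s} = \sum_{n=u}^{v} \frac{f(n)}{n^{\sigma_0}} \cdot \frac{1}{n^{s - \sigma_0}}$$
$$= \sum_{n=u}^{v-1} \left( \sum_{m=1}^{n} \frac{f(m)}{m^{\sigma_0}} \right) \left( \frac{1}{n^{s - \sigma_0}} - \frac{1}{(n + 1)^{s - \sigma_0}} \right) + \frac{1}{v^{s - \sigma_0}} \sum_{m=1}^{v} \frac{f(m)}{m^{\sigma_0}} - \frac{1}{u^{s - \sigma_0}} \sum_{m=1}^{u-1} \frac{f(m)}{m^{\sigma_0}},$$

so that

$$\left| \sum_{n=u}^{v} \frac{f(n)}{n^s} \right| \le A \sum_{n=u}^{v-1} \left| (s - \sigma_0) \int_n^{n+1} \frac{dt}{t^{s - \sigma_0 + 1}} \right| + 2A u^{-(\sigma - \sigma_0)} \le A\, |s - \sigma_0| \sum_{n=u}^{\infty} \frac{1}{n^{\sigma - \sigma_0 + 1}} + 2A u^{-(\sigma - \sigma_0)},$$

where $A$ is such that

$$\left| \sum_{n=1}^{v} \frac{f(n)}{n^{\sigma_0}} \right| < A, \quad \text{for all } v \ge 1.$$

It follows that the series

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$$

converges to an analytic function in the half-plane $\sigma > \tfrac12$. Since it
coincides with $L(s, \chi)\zeta(s)$ for $s$ real and larger than 1, it represents
an analytic continuation of $L(s, \chi)\zeta(s)$ for $\sigma > \tfrac12$. But it is unbounded
near $s = \tfrac12$, while $L(s, \chi)\zeta(s)$ is not.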



CHAPTER 7 


THE PRIME NUMBER THEOREM 


7-1 Introduction. It is shown in elementary number theory
texts* that if

$$\lim_{x \to \infty} \frac{\pi(x)}{x/\log x}$$

exists, it must have the value 1, and that there are positive constants
$c$ and $c'$ such that for $x \ge 2$,

$$c < \frac{\pi(x)}{x/\log x} < c'.$$

Neither of these results implies the other, of course; together they
show that

$$0 < \liminf_{x \to \infty} \frac{\pi(x)}{x/\log x} \le 1 \le \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x} < \infty.$$

(Here, as always, $\pi(x)$ denotes the number of primes less than or
equal to $x$.) Both results were obtained by Chebyshev in 1851 and
1852 (in rather more precise form), but it was not until some forty-five
years later that the final link was supplied by Hadamard and
de la Vallée Poussin, who showed independently that the limit
actually exists, and thus proved the Prime Number Theorem. Both
proofs made essential use of the theory of functions of a complex
variable, and despite much effort it seemed for many years impossible
to give a proof entirely free of considerations as sophisticated as
this theory. In 1948, however, P. Erdős and A. Selberg gave a
completely elementary proof. More precisely, Selberg proved the
fundamental relation

$$\sum_{p \le x} \log^2 p + \sum_{pq \le x} \log p \log q = 2x \log x + O(x),$$

and he and Erdős deduced the Prime Number Theorem from it.†

* See, for example, Volume I, Sections 6-6 and 6-7.
† Excellent expositions of this proof are given in T. Nagell, Introduction
to Number Theory (New York: John Wiley & Sons, 1951) and in G. H.
Hardy and E. M. Wright, An Introduction to the Theory of Numbers (3rd
edition, New York: Oxford University Press, 1954).


We present a proof based on the behavior of the $\zeta$-function for
complex $s$. Throughout this chapter, familiarity with the contents of
a standard course in the theory of functions of a complex variable
is presupposed.

Before going into detail, we outline the proof. Our object is to get
an estimate for

$$\pi(x) = \sum_{p \le x} 1 = \sum_{n \le x} P(n),$$

where $P$ is the characteristic function of the primes:

$$P(n) = \begin{cases} 1 & \text{if } n \text{ is prime}, \\ 0 & \text{otherwise.} \end{cases}$$

While $P$ itself does not arise in a natural way, the function $P^*$ such
that

$$P^*(n) = \begin{cases} 1/m & \text{if } n = p^m \text{ for some } m, p, \\ 0 & \text{otherwise,} \end{cases}$$

occurs in the Dirichlet series for $\log \zeta(s)$:

$$\log \zeta(s) = \sum_{m,p} \frac{1}{m p^{ms}} = \sum_{n=1}^{\infty} \frac{P^*(n)}{n^s}. \tag{1}$$

For fixed $m$, the number of $m$th powers of primes which do not exceed
$x$ is equal to the number of primes which do not exceed $x^{1/m}$, so that

$$\sum_{n \le x} P^*(n) = \sum_{n \le x} P(n) + \frac12 \sum_{n \le x^{1/2}} P(n) + \frac13 \sum_{n \le x^{1/3}} P(n) + \cdots = \pi(x) + \frac{\pi(\sqrt{x})}{2} + \frac{\pi(\sqrt[3]{x})}{3} + \cdots,$$

and since, for $m \ge 2$,

$$\pi(x^{1/m}) < c\sqrt{x} = o\!\left(\frac{x}{\log x}\right),$$

it is to be expected that

$$\sum_{n \le x} P^*(n) \sim \pi(x).$$
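In light of (1), the present case is a specialization of the following
problem: given a function

$$f(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s},$$

to estimate

$$\sum_{n \le x} a_n. \tag{2}$$

It will be shown that

$$J(w) = \frac{1}{2\pi i} \int_{2 - i\infty}^{2 + i\infty} \frac{e^{ws}}{s^2}\, ds = \begin{cases} w & \text{if } w \ge 0, \\ 0 & \text{if } w \le 0, \end{cases}$$

so that $J(w)$ is closely related to the characteristic function of the
positive real numbers. If we put

$$w = \log \frac{x}{n},$$

this gives

$$\frac{1}{2\pi i} \int_{2 - i\infty}^{2 + i\infty} \frac{(x/n)^s}{s^2}\, ds = \begin{cases} \log (x/n) & \text{if } n \le x, \\ 0 & \text{if } n \ge x, \end{cases}$$

so that

$$\frac{1}{2\pi i} \int_{2 - i\infty}^{2 + i\infty} \frac{x^s}{s^2} f(s)\, ds = \sum_{n \le x} a_n \log \frac{x}{n}.$$

If $\delta = \delta(x)$ tends monotonically to zero as $x \to \infty$, then

$$\sum_{n \le x(1+\delta)} a_n \log \frac{x(1 + \delta)}{n} - \sum_{n \le x} a_n \log \frac{x}{n} = \log (1 + \delta) \sum_{n \le x} a_n + \sum_{x < n \le x + \delta x} a_n \log \frac{x(1 + \delta)}{n}$$
$$= \log (1 + \delta) \left( \sum_{n \le x} a_n + O\!\left( \sum_{x < n \le x + \delta x} |a_n| \right) \right).$$

If the remainder term here is of smaller order of magnitude than the
first term for suitable choice of $\delta$, then

$$\sum_{n \le x} a_n \sim \frac{1}{\log (1 + \delta)} \cdot \frac{1}{2\pi i} \int_{2 - i\infty}^{2 + i\infty} \frac{x^s \bigl((1 + \delta)^s - 1\bigr)}{s^2} f(s)\, ds,$$

and the problem reduces to that of obtaining an adequate estimate
of this integral. To do this, we replace the line of integration by a
suitable large closed contour, inside and on which we have sufficient
information about $f(s)$ to apply standard contour-integral techniques.

In the case at hand, the estimation of the integral in the last
relation requires some knowledge of the zeros, poles, and size of $\zeta(s)$.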





7-2 Preliminary results. Following the odd but harmless tradition
in analytic number theory, we designate by $\sigma$ and $t$ the real and
imaginary parts of the complex variable $s$. For $x > 0$, $x^s$ means
$e^{s \log x}$, where $\log x$ indicates the real logarithm.

When we have proved the Prime Number Theorem, we shall consider
some other rather similar problems, and for one of these it will
be necessary to use not the Riemann $\zeta$-function but the so-called
Hurwitz $\zeta$-function, defined for $0 < w \le 1$ and $\sigma > 1$ by the equation

$$\zeta(s, w) = \sum_{n=0}^{\infty} \frac{1}{(n + w)^s}.$$

Since $\zeta(s, 1) = \zeta(s)$, and since the requisite properties are no more
difficult to prove for $\zeta(s, w)$ than for $\zeta(s)$, we consider the more
general function.

Theorem 7-1. For any $\sigma_0 > 1$, the series

$$\sum_{n=0}^{\infty} (n + w)^{-s}$$

converges uniformly for $\sigma \ge \sigma_0$, so that $\zeta(s, w)$ is regular (or analytic)
for $\sigma > 1$.

Proof: We have

$$|(n + w)^{-s}| = |e^{-s \log (n + w)}| = (n + w)^{-\sigma},$$

so that for $\sigma \ge \sigma_0$,

$$\sum_{n=0}^{\infty} (n + w)^{-s} \ll \sum_{n=0}^{\infty} (n + w)^{-\sigma_0}.$$

Thus we have a series of analytic functions which is dominated
throughout the region $\sigma \ge \sigma_0$ by a convergent series of positive
constants, and which is therefore uniformly convergent, and the result
follows from Weierstrass' theorem.

Theorem 7-2. If $a$ and $b$ are integers with $b > a \ge 0$, and if $f$
has a continuous derivative over $a \le x \le b$, then

$$\sum_{n=a+1}^{b} f(n) = \int_a^b f(u)\, du + \int_a^b (u - [u]) f'(u)\, du.$$

Proof: We have

$$\int_{n-1}^{n} u f'(u)\, du = n f(n) - (n - 1) f(n - 1) - \int_{n-1}^{n} f(u)\, du$$
$$= f(n) + (n - 1) \int_{n-1}^{n} f'(u)\, du - \int_{n-1}^{n} f(u)\, du$$
$$= f(n) + \int_{n-1}^{n} [u] f'(u)\, du - \int_{n-1}^{n} f(u)\, du,$$

from which the result follows by summing on $n$ from $a + 1$ to $b$.
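
The summation formula of Theorem 7-2 can be checked numerically for a smooth test function. The sketch below integrates one unit interval at a time (on each open interval $(n-1, n)$ the integer part $[u]$ is the constant $n - 1$) using a composite Simpson rule; the helper names `simpson` and `summation_rhs` and the test function are assumptions for the example.

```python
def simpson(g, lo, hi, steps=200):
    """Composite Simpson rule for the integral of g over [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    s = g(lo) + g(hi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * g(lo + i * h)
    return s * h / 3

def summation_rhs(f, df, a, b):
    """Right side of Theorem 7-2: integral of f plus integral of (u - [u]) f'(u) over [a, b]."""
    total = 0.0
    for n in range(a + 1, b + 1):
        total += simpson(f, n - 1, n)
        # on (n - 1, n) the integer part [u] equals n - 1
        total += simpson(lambda u: (u - (n - 1)) * df(u), n - 1, n)
    return total

f = lambda u: 1 / (u + 1) ** 2          # test function, with derivative df below
df = lambda u: -2 / (u + 1) ** 3

lhs = sum(f(n) for n in range(1, 11))   # sum of f(n) for n = a + 1, ..., b with a = 0, b = 10
print(lhs, summation_rhs(f, df, 0, 10))
```

The two printed values agree up to the quadrature error of the Simpson rule.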

Theorem 7-3. If $m$ is a non-negative integer, and $\sigma > 1$, then

$$\zeta(s, w) = \sum_{n=0}^{m} \frac{1}{(n + w)^s} + \frac{1}{(s - 1)(m + w)^{s-1}} - s \int_m^{\infty} \frac{u - [u]}{(u + w)^{s+1}}\, du. \tag{3}$$

It follows that $\zeta(s, w) - 1/(s - 1)$ is regular for $\sigma > 0$, and that
(3) holds for $\sigma > 0$.

Proof: If $\sigma > 1$ and

$$f(u) = \frac{1}{(u + w)^s},$$

then the equation of Theorem 7-2 continues to hold if $b \to \infty$, and,
replacing $a$ by $m$, we have

$$\sum_{n=m+1}^{\infty} \frac{1}{(n + w)^s} = \frac{1}{(s - 1)(m + w)^{s-1}} - s \int_m^{\infty} \frac{u - [u]}{(u + w)^{s+1}}\, du,$$

from which (3) follows. Since

$$\left| \frac{u - [u]}{(u + w)^{s+1}} \right| \le \frac{1}{(u + w)^{\sigma + 1}},$$

the integral on the right side of (3) converges absolutely for $\sigma > 0$,
and uniformly for $\sigma \ge \sigma_0 > 0$. For arbitrary $n \ge 0$, the quantity

$$\int_n^{n+1} \frac{u - [u]}{(u + w)^{s+1}}\, du = \int_n^{n+1} \frac{u - n}{(u + w)^{s+1}}\, du$$

is a regular function of $s$ for $\sigma > 0$; the same is true of

$$\sum_{n=m}^{\infty} \int_n^{n+1} \frac{u - [u]}{(u + w)^{s+1}}\, du = \int_m^{\infty} \frac{u - [u]}{(u + w)^{s+1}}\, du,$$

for $m \ge 0$. Finally, taking $m = 0$ in (3), we have

$$\zeta(s, w) - \frac{1}{s - 1} = \frac{1}{w^s} + \frac{w^{1-s} - 1}{s - 1} - s \int_0^{\infty} \frac{u - [u]}{(u + w)^{s+1}}\, du,$$

and the right side is regular for $\sigma > 0$.

Equation (3) thus provides an analytic continuation of $\zeta(s, w)$
over the half-plane $\sigma > 0$. The function is actually analytic over the
entire plane, except for the pole at $s = 1$, but this fact is not needed.

Hereafter $c$ will denote a positive constant which depends only on
the arguments indicated; it need not have the same value in different
occurrences, unless it has a subscript.

Theorem 7-4. For $\tfrac12 \le \sigma \le 2$ and $t > c(w)$,

lr(s, uj)| < 

For $t \ge 8$ and $1 - (\log t)^{-1} \le \sigma \le 2$,

$$|\zeta(s, w)| < c(w) \log t.$$

Proof: For $\tfrac12 \le \sigma \le 2$ and $t \ge 3$ we have $|s| \le 2 + t < 2t$ and
$|s - 1| \ge t > 1$. Hence if we take $m = [t] + 1$ in (3), we have

du 

-7 -h — + i:- + 2u - 

(Id + 1 + w)’-' ^ w’ 

1 “ 1 , 2 / 

([/] + 1 + ^ ’ 


1 1 “’+1 1 f 

!!•(».«-)! < . . + + H 


.H-1 


or 




1 


{[t] + l+ w) 
Thus, for this same range of a and 


1 

—j + c(«7) + E — + 4( '■ 


(4) 


n-l n 




1 1 /• 
-i + c{w) + E ” 7 = + 4Vc 

1 _|_ 14,)- f „=i vn 


{[1] -\-l + w) 


L 


and this is smaller than for t > c{v}). * -f 

Now take $t \ge 8 > e^2$. Then $1 - (\log t)^{-1} > \tfrac12$, so that if
$1 - (\log t)^{-1} \le \sigma \le 2$, the inequality (4) gives




[<1 j^lAogt 

|!-(s, «;)] < {20'^°'' + c{w) + Z 

n ■*! ^ 

. i!I 1 




+ 4i 


1/log t 


< 2^e 4- c(to) + c H - -h 4e < c(w) log L 

n-l n 

Theorem 7-5. If, for $|x| < 1$,

$$f(x) = \sum_{n=1}^{\infty} a_n x^n$$

is regular and $\operatorname{Re} f(x) \le \tfrac12$, then $|a_n| \le 1$ for $n \ge 1$.

Proof: Since $|f(x)| \le |1 - f(x)|$ for $|x| < 1$, the function

$$\frac{f(x)}{1 - f(x)} = \frac{a_1 x + a_2 x^2 + \cdots}{1 - a_1 x - \cdots} = a_1 x + b_2 x^2 + \cdots$$

is regular and has modulus at most 1 for $|x| < 1$. But the function

$$f_1(x) = \frac{f(x)}{x(1 - f(x))}$$

is also regular for $|x| < 1$, and its value at $x = 0$ is $a_1$; by the
maximum-modulus principle, its absolute value is at least as large at some
point on $|x| = 1$. Since for $|x| = 1$,

$$|f_1(x)| = \left| \frac{f(x)}{1 - f(x)} \right| \le 1,$$

it follows that

$$|a_1| \le 1. \tag{5}$$

The theorem will therefore be proved if we show that each of the
functions

$$F_n(x) = a_n x + a_{2n} x^2 + \cdots$$

fulfills the same hypotheses as $f(x)$ itself. This depends on the fact
that if $\eta = e^{2\pi i/n}$, then

$$\sum_{l=0}^{n-1} \eta^{lk} = \begin{cases} n & \text{if } n \mid k, \\ (\eta^{nk} - 1)/(\eta^k - 1) = 0 & \text{if } n \nmid k. \end{cases}$$

We have

$$\sum_{l=0}^{n-1} f(\eta^l x) = \sum_{l=0}^{n-1} \sum_{k=1}^{\infty} a_k \eta^{lk} x^k = \sum_{k=1}^{\infty} a_k x^k \sum_{l=0}^{n-1} \eta^{lk} = n \sum_{n \mid k} a_k x^k = n F_n(x^n),$$

so that $F_n(x)$ is regular for $|x| < 1$, and for such $x$,

$$\operatorname{Re} F_n(x^n) = \frac{1}{n} \sum_{l=0}^{n-1} \operatorname{Re} f(\eta^l x) \le \frac{1}{n} \sum_{l=0}^{n-1} \frac12 = \frac12.$$

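Theorem 7-6. Let $R$ be positive, and suppose that

$$f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$$

is regular and $\operatorname{Re} f(x) \le M$ for $|x - x_0| < R$. Then, for $n \ge 1$,

$$|a_n| \le \frac{2}{R^n} (M - \operatorname{Re} a_0).$$

Proof: If $\operatorname{Re} a_0 = M$, then $a_n = 0$ for $n \ge 1$, by the
maximum-modulus principle.

If $\operatorname{Re} a_0 < M$, put

$$g(x) = \frac{f(x_0 + Rx) - a_0}{2(M - \operatorname{Re} a_0)}.$$

Then $g$ is regular for $|x| < 1$, $g(0) = 0$, and

$$\operatorname{Re} g(x) = \frac{\operatorname{Re} f(x_0 + Rx) - \operatorname{Re} a_0}{2(M - \operatorname{Re} a_0)} \le \frac{M - \operatorname{Re} a_0}{2(M - \operatorname{Re} a_0)} = \frac12.$$

Hence $g$ satisfies the hypotheses of Theorem 7-5, so that

$$\frac{|a_n| R^n}{2(M - \operatorname{Re} a_0)} \le 1,$$

and the theorem follows.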

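Theorem 7-7. If $f$ satisfies the hypotheses of Theorem 7-6, and
$0 < r < R$, then for $|x - x_0| \le r$,

$$|f(x)| \le |a_0| + \frac{2r}{R - r} (|M| + |a_0|)$$

and

$$|f'(x)| \le \frac{2R}{(R - r)^2} (|M| + |a_0|).$$

Proof: We have

$$|f(x)| \le |a_0| + \sum_{n=1}^{\infty} |a_n| r^n \le |a_0| + 2(|M| + |a_0|) \sum_{n=1}^{\infty} \left( \frac{r}{R} \right)^n = |a_0| + \frac{2r}{R - r} (|M| + |a_0|),$$

and

$$|f'(x)| \le \sum_{n=1}^{\infty} |a_n|\, n r^{n-1} \le \frac{2(|M| + |a_0|)}{R} \sum_{n=1}^{\infty} n \left( \frac{r}{R} \right)^{n-1} = \frac{2R}{(R - r)^2} (|M| + |a_0|).$$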





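Theorem 7-8. Let $r$ be positive and $M$ real, and suppose that
$f(s_0) \ne 0$ and that for $|s - s_0| \le r$, $f(s)$ is regular and

$$\left| \frac{f(s)}{f(s_0)} \right| < e^M.$$

Suppose also that $f(s) \ne 0$ in the semicircular region $|s - s_0| \le r$,
$\operatorname{Re} s > \operatorname{Re} s_0$. Then

$$-\operatorname{Re} \frac{f'}{f}(s_0) \le \frac{4M}{r},$$

and if there is a zero $\rho$ of $f$ on the open line segment between $s_0 - r/2$
and $s_0$, then

$$-\operatorname{Re} \frac{f'}{f}(s_0) \le \frac{4M}{r} - \frac{1}{s_0 - \rho}.$$

Proof: There is clearly no loss in generality in supposing that
$f(s_0) = 1$ and $s_0 = 0$. In this case, the hypotheses can be listed as
follows.

(1) For $|s| \le r$, $f(s)$ is regular and $|f(s)| < e^M$, where $M > 0$.

(2) $f(0) = 1$.

(3) $f(s) \ne 0$ for $|s| \le r$, $\sigma > 0$.

We look for an upper bound for $-\operatorname{Re} f'(0)$.

If $\rho$ runs through the zeros of $f$ in the circle $|s| \le r/2$, then the
function

$$g(s) = \frac{f(s)}{\prod_\rho (1 - s/\rho)}$$

is regular for $|s| \le r$. On the circle $|s| = r$, we have

$$\left| 1 - \frac{s}{\rho} \right| \ge \left| \frac{s}{\rho} \right| - 1 \ge 1,$$

so that here $|g(s)| \le |f(s)| < e^M$. By the maximum-modulus
principle,

$$|g(s)| < e^M \quad \text{for } |s| \le \frac{r}{2}.$$

Since $g(s) \ne 0$ for $|s| \le r/2$, and $g(0) = 1$, we can write

$$g(s) = e^{G(s)} \quad \text{for } |s| \le \frac{r}{2},$$

where $G$ is regular and $\operatorname{Re} G(s) < M$, $G(0) = 0$. By Theorem 7-6,
with $r/2$ instead of $R$,

$$|G'(0)| \le \frac{2}{r/2}\, M = \frac{4M}{r}.$$

But

$$\frac{g'}{g}(s) = \frac{f'}{f}(s) - \sum_\rho \frac{-1/\rho}{1 - s/\rho} = \frac{f'}{f}(s) + \sum_\rho \frac{1}{\rho - s},$$

so that

$$\left| f'(0) + \sum_\rho \frac{1}{\rho} \right| = \left| \frac{g'}{g}(0) \right| = |G'(0)| \le \frac{4M}{r},$$

$$-\operatorname{Re} f'(0) - \sum_\rho \operatorname{Re} \frac{1}{\rho} \le \frac{4M}{r},$$

$$-\operatorname{Re} f'(0) \le \frac{4M}{r} + \sum_\rho \operatorname{Re} \frac{1}{\rho}.$$

Since we have supposed that all zeros $\rho$ have nonpositive real parts,
the theorem follows.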

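If $f$ is regular on the vertical line $\sigma_0 + ti$, and if

$$\lim_{a, b \to \infty} \int_{\sigma_0 - ai}^{\sigma_0 + bi} f(s)\, ds = \lim_{a, b \to \infty} \int_{-a}^{b} f(\sigma_0 + ti)\, i\, dt$$

exists, then we abbreviate this limit to

$$\int_{(\sigma_0)} f(s)\, ds.$$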


Theorem 7-9. have 


2in 7(2) s 


0 for 0 < y < 1, 

logy fory>l. 



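Proof: The integral converges, because on the line $\sigma = 2$
\[
\left|\frac{y^s}{s^2}\right| = \frac{y^2}{4 + t^2}.
\]
First suppose that $0 < y \le 1$. Then in the region bounded by
$C_1$ and $C_2$ (see Fig. 7-1), the integrand is regular, so that by
Cauchy's theorem,
\[
\left|\int_{C_1}\frac{y^s}{s^2}\,ds\right| = \left|\int_{C_2}\frac{y^s}{s^2}\,ds\right|.
\]
But along $C_2$, which is of length $\pi a$, we have
\[
\left|\frac{y^s}{s^2}\right| \le \frac{y^2}{a^2},
\]
so that
\[
\left|\int_{C_2}\frac{y^s}{s^2}\,ds\right| \le \pi a\cdot\frac{y^2}{a^2} = \frac{\pi y^2}{a}.
\]
Hence, as $a \to \infty$,
\[
\frac{1}{2\pi i}\int_{C_1}\frac{y^s}{s^2}\,ds \to 0,
\]
and the result follows.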

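Now suppose that $y > 1$, and that $a > 2$. Then the pole $s = 0$
of the integrand lies in the region bounded by $C_1$ and $C_3$, and since
\[
\frac{y^s}{s^2} = \frac{1 + s\log y + (s^2\log^2 y)/2 + \cdots}{s^2} = \frac{1}{s^2} + \frac{\log y}{s} + \cdots,
\]
we have by the residue theorem that
\[
\frac{1}{2\pi i}\int_{C_1}\frac{y^s}{s^2}\,ds + \frac{1}{2\pi i}\int_{C_3}\frac{y^s}{s^2}\,ds = \log y.
\]
But along $C_3$, which is of length $\pi a$, we have
\[
\left|\frac{y^s}{s^2}\right| \le \frac{y^2}{(a-2)^2},
\]
so that, for $a > 4$,
\[
\left|\int_{C_3}\frac{y^s}{s^2}\,ds\right| \le \frac{\pi a y^2}{(a-2)^2} < \frac{4\pi y^2}{a}.
\]
Hence, as $a \to \infty$,
\[
\frac{1}{2\pi i}\int_{C_1}\frac{y^s}{s^2}\,ds \to \log y,
\]
and the result follows.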
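
As an informal numerical sanity check on Theorem 7-9 (not part of the proof), the following sketch approximates the integral along a truncated segment of the line $\sigma = 2$ by the midpoint rule. The function name `truncated_perron`, the cutoff `T`, and the step `h` are illustrative assumptions; the truncation introduces an error of order $y^2/T$ at worst.

```python
import cmath
import math


def truncated_perron(y, sigma=2.0, T=500.0, h=0.02):
    """Approximate (1/(2*pi*i)) * integral of y**s / s**2 ds over the
    segment from sigma - iT to sigma + iT, using the midpoint rule in t."""
    log_y = math.log(y)
    steps = int(round(2 * T / h))
    total = 0j
    for j in range(steps):
        t = -T + (j + 0.5) * h
        s = complex(sigma, t)
        total += cmath.exp(s * log_y) / (s * s)
    # ds = i dt, so the factor i cancels against the i in 1/(2*pi*i).
    return (total * h).real / (2 * math.pi)


if __name__ == "__main__":
    for y in (0.5, 1.0, 3.0):
        # Second column: numerical value; third: max(0, log y) from the theorem.
        print(y, truncated_perron(y), max(0.0, math.log(y)))
```

For $y \le 1$ the computed value should be close to zero, and for $y > 1$ close to $\log y$, up to the truncation error.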




7-3 The Prime Number Theorem. It will be necessary in what
follows to know something about the location of the zeros of the
$\zeta$-function. For $|t|$ large, this information is supplied by Theorem 7-13;
for $|t|$ small, we use only the fact that $\zeta(s)$ does not vanish for $\sigma \ge 1$.
Historically, this was the first nontrivial result obtained concerning
the zeros of the $\zeta$-function. (A trivial fact is that $\zeta(s) \ne 0$ for $\sigma > 1$,
which follows immediately from the product representation
\[
\zeta(s) = \prod_{p}(1 - p^{-s})^{-1},
\]
valid for $\sigma > 1$.) The proof below that $\zeta(1 + ti) \ne 0$ is due to de la
Vallée Poussin; it may have been suggested by the following
considerations.

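For $\sigma > 1$ we have
\[
\log\zeta(s) = \sum_{m,p}\frac{1}{m p^{ms}} = \sum_{p}\frac{1}{p^{s}} + f(s),
\]
and $f$ is easily seen to be regular for $\sigma > \tfrac12$. Since $\zeta$ has a pole at
$s = 1$, with residue 1, it follows that as $\sigma \to 1^{+}$,
\[
\sum_{p}\frac{1}{p^{\sigma}} \sim \log\frac{1}{\sigma - 1}. \tag{6}
\]
We now reason heuristically. If $1 + t_0 i$ is a zero of $\zeta$, and we put
$s = \sigma + t_0 i$, then as $\sigma \to 1^{+}$,
\[
\log|\zeta(s)| \sim \log(\sigma - 1),
\]
and
\[
\operatorname{Re}\log\zeta(s) - \operatorname{Re} f(s) = \log|\zeta(s)| - \operatorname{Re} f(s)
= \sum_{p}\frac{\cos(t_0\log p)}{p^{\sigma}} \sim \log(\sigma - 1).
\]
Comparing this with (6) we see that for most $p$, $\cos(t_0\log p)$ must
be close to $-1$. But then $\cos(2t_0\log p)$ must usually be nearly 1, and
\[
\sum_{p}\frac{\cos(2t_0\log p)}{p^{\sigma}} \sim \sum_{p}\frac{1}{p^{\sigma}} \sim \log\frac{1}{\sigma - 1}.
\]
But this requires that $\zeta$ have a pole at $1 + 2t_0 i$, which is not the case.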
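To make this argument rigorous, note that for all real $\theta$,
\[
3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^2 \ge 0.
\]
Hence for $\sigma > 1$,
\[
\begin{aligned}
\log|\zeta^{3}(\sigma)\,\zeta^{4}(\sigma + t_0 i)\,\zeta(\sigma + 2t_0 i)|
&= 3\log|\zeta(\sigma)| + 4\log|\zeta(\sigma + t_0 i)| + \log|\zeta(\sigma + 2t_0 i)| \\
&= 3\sum_{n,p}\frac{1}{n p^{n\sigma}} + 4\sum_{n,p}\frac{\cos(t_0 n\log p)}{n p^{n\sigma}} + \sum_{n,p}\frac{\cos(2t_0 n\log p)}{n p^{n\sigma}} \\
&= \sum_{n,p}\frac{3 + 4\cos(t_0 n\log p) + \cos(2t_0 n\log p)}{n p^{n\sigma}} \\
&\ge 0.
\end{aligned}
\]
Thus
\[
\bigl((\sigma - 1)\zeta(\sigma)\bigr)^{3}\left|\frac{\zeta(\sigma + t_0 i)}{\sigma - 1}\right|^{4}|\zeta(\sigma + 2t_0 i)| \ge \frac{1}{\sigma - 1},
\]
and if $1 + t_0 i$ were a zero of $\zeta$, the left side in this inequality would
remain bounded as $\sigma \to 1^{+}$, while the right side increases without
limit.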
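
The argument rests on nothing more than the nonnegativity of one trigonometric polynomial. As a small illustration (a sketch only; the names `kernel` and `log_product_partial` and the cutoff `bound` are assumptions introduced here), the code below checks the identity numerically and evaluates a partial sum of the series for $\log|\zeta^3(\sigma)\zeta^4(\sigma+t_0 i)\zeta(\sigma+2t_0 i)|$ over prime powers, which is nonnegative term by term.

```python
import math


def kernel(theta):
    """The trigonometric polynomial 3 + 4 cos(theta) + cos(2 theta)."""
    return 3.0 + 4.0 * math.cos(theta) + math.cos(2.0 * theta)


def primes_up_to(n):
    """Sieve of Eratosthenes returning the list of primes <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def log_product_partial(sigma, t0, bound=10_000):
    """Partial sum, over prime powers p**n <= bound, of
    kernel(t0 * n * log p) / (n * p**(n * sigma))."""
    total = 0.0
    for p in primes_up_to(bound):
        n, q = 1, p
        while q <= bound:
            total += kernel(t0 * n * math.log(p)) / (n * q ** sigma)
            n, q = n + 1, q * p
    return total


if __name__ == "__main__":
    print(log_product_partial(1.1, 14.134725))
```

Every summand is a nonnegative multiple of `kernel`, so the partial sum cannot be negative whatever values of `sigma > 1` and `t0` are supplied.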

We now use this technique, together with Theorem 7-8, to show that
$\zeta(s)$ does not vanish at any point too close to the line $\sigma = 1$ and
sufficiently far from the real axis.

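Theorem 7-10. For $\sigma > 1$,
\[
\operatorname{Re}\left(-3\frac{\zeta'}{\zeta}(\sigma) - 4\frac{\zeta'}{\zeta}(\sigma + ti) - \frac{\zeta'}{\zeta}(\sigma + 2ti)\right) \ge 0.
\]
Proof: Differentiating the relation
\[
\log\zeta(s) = \sum_{m,p}\frac{1}{m p^{ms}},
\]
we obtain
\[
\frac{\zeta'}{\zeta}(s) = -\sum_{m,p}\frac{\log p}{p^{ms}} = -\sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^{s}}, \tag{7}
\]
where
\[
\Lambda(n) =
\begin{cases}
\log p & \text{if } n = p^{m}, \text{ for any } m > 0 \text{ and prime } p,\\
0 & \text{otherwise.}
\end{cases}
\]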
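
For readers who want to experiment with $\Lambda(n)$, here is a minimal sketch by trial division (the function name `von_mangoldt` is an assumption made for this illustration). It also checks the standard identity $\sum_{d \mid n}\Lambda(d) = \log n$ on one example.

```python
import math


def von_mangoldt(n):
    """Return log p if n is a power p**m (m >= 1) of a prime p, else 0.0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; strip it out completely.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # no factor up to sqrt(n): n itself is prime


if __name__ == "__main__":
    n = 360
    total = sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)
    print(total, math.log(n))
```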


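The termwise differentiation is justified because the series for $\log\zeta(s)$
converges uniformly in any region to the right of $\sigma = 1$. Hence
\[
\begin{aligned}
\operatorname{Re}\left(-3\frac{\zeta'}{\zeta}(\sigma) - 4\frac{\zeta'}{\zeta}(\sigma + ti) - \frac{\zeta'}{\zeta}(\sigma + 2ti)\right)
&= \operatorname{Re}\sum_{n=1}^{\infty}\frac{(3 + 4n^{-ti} + n^{-2ti})\Lambda(n)}{n^{\sigma}} \\
&= \sum_{n=1}^{\infty}\frac{(3 + 4\cos(t\log n) + \cos(2t\log n))\Lambda(n)}{n^{\sigma}} \\
&\ge 0.
\end{aligned}
\]
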

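Theorem 7-11. (a) For $\sigma \ge \tfrac12$ and $t \ge c$, we have $|\zeta(s)| < t$.

(b) For $t \ge 8$ and $\sigma \ge 1 - (\log t)^{-1}$, we have $|\zeta(s)| < c\log t$.

Proof: For $\sigma \ge 2$ and $t \ge 8$,
\[
|\zeta(s)| \le \sum_{n=1}^{\infty}\frac{1}{n^{2}} < 2 <
\begin{cases}
t,\\
\log t.
\end{cases}
\]
For $\sigma < 2$, both inequalities of the theorem follow from Theorem
7-4.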


Theorem 7-12. For $\sigma > 1$,
\[
\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty}\frac{\mu(n)}{n^{s}},
\]
where $\mu$ is the Möbius function.

Proof: This follows immediately from Theorem 6-12 for $s$ real
and greater than 1; by analytic continuation, it is correct for $\sigma > 1$.
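
The identity is easy to test numerically at $s = 2$, where $1/\zeta(2) = 6/\pi^2$ is a standard value. The sketch below (the function name `mobius_sieve` and the cutoff 20000 are choices made for this illustration) computes $\mu(n)$ by a sieve and compares a partial sum with $6/\pi^2$; the neglected tail is smaller than the reciprocal of the cutoff.

```python
import math


def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Mobius function for 0 < n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more distinct prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu


if __name__ == "__main__":
    mu = mobius_sieve(20_000)
    partial = sum(mu[n] / n ** 2 for n in range(1, 20_001))
    print(partial, 6 / math.pi ** 2)
```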

Theorem 7-13. There are constants $c_1 \ge 8$ and $c_2 > 0$ such that
$\zeta(s) \ne 0$ for
\[
t \ge c_1 \quad\text{and}\quad \sigma \ge 1 - \frac{c_2}{\log t}.
\]

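Proof: In accordance with Theorem 7-11 (a), choose $c_3 \ge 8$ such
that
\[
|\zeta(s)| < t, \quad\text{for } \sigma \ge \tfrac12,\ t \ge c_3. \tag{8}
\]
Inasmuch as
\[
\frac34 + \frac{c_4}{\log x} < 1 - \frac{c_2}{\log x}
\]
for all sufficiently large $x$,
it suffices to show that any zero $\beta + \gamma i$ of $\zeta$ with $\gamma$ sufficiently large
(in particular, larger than 8) and for which
\[
\beta \ge \frac34 + \frac{c_4}{\log\gamma}
\]
is such that
\[
\beta < 1 - \frac{c_2}{\log\gamma}.
\]
Put
\[
\sigma_0 = \sigma_0(\gamma) = 1 + \frac{c_4}{\log\gamma},
\]
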
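and suppose that $\beta + \gamma i$ is a zero of $\zeta$ for which $\gamma \ge c_3 + \tfrac12$ and
$\beta \ge \sigma_0 - \tfrac14$. We shall apply Theorem 7-8, once with $s_0 = \sigma_0 + \gamma i$
and once with $s_0 = \sigma_0 + 2\gamma i$. In either case, since $\sigma_0 > 1$, we have
that for $\gamma \ge c_3 + \tfrac12$, the circle $|s - s_0| \le \tfrac12$ lies in the quadrant
$\sigma \ge \tfrac12$, $t \ge c_3$. Since $\gamma > e^{c_4}$, we have $\sigma_0 < 2$, and, by Theorem 7-12,
\[
\left|\frac{1}{\zeta(s_0)}\right| \le \sum_{n=1}^{\infty}\frac{1}{n^{\sigma_0}} < 1 + \int_{1}^{\infty}\frac{du}{u^{\sigma_0}}
= 1 + \frac{1}{\sigma_0 - 1} < \frac{2}{\sigma_0 - 1} = \frac{2}{c_4}\log\gamma.
\]
Thus for each $\epsilon_1 > 0$ there is a $c_5$ such that for $\gamma \ge c_5 \ge c_3 + \tfrac12$ the
inequality
\[
\left|\frac{\zeta(s)}{\zeta(s_0)}\right| < \left(2\gamma + \tfrac12\right)\frac{2}{c_4}\log\gamma < \gamma^{1+\epsilon_1}
\]
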



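holds at every point $s$ of the circular disk $|s - s_0| \le \tfrac12$, since at every
such point, $c_3 \le t \le 2\gamma + \tfrac12$. If $\gamma \ge c_5$, we can now apply Theorem
7-8, with $r = \tfrac12$, $f(s) = \zeta(s)$, $M = (1 + \epsilon_1)\log\gamma$. Using the first
inequality of that theorem with $s_0 = \sigma_0 + 2\gamma i$, we obtain
\[
-\operatorname{Re}\frac{\zeta'}{\zeta}(\sigma_0 + 2\gamma i) \le 8(1 + \epsilon_1)\log\gamma; \tag{9}
\]
using the second with $s_0 = \sigma_0 + \gamma i$ we have
\[
-\operatorname{Re}\frac{\zeta'}{\zeta}(\sigma_0 + \gamma i) \le 8(1 + \epsilon_1)\log\gamma - \frac{1}{\sigma_0 - \beta}, \tag{10}
\]
since
\[
\sigma_0 - r/2 = \sigma_0 - \tfrac14 \le \beta \le 1 < \sigma_0.
\]
Finally, since $\sigma_0 \to 1^{+}$ as $\gamma \to \infty$, we have from (6) that for $\epsilon_2 > 0$,
\[
-\frac{\zeta'}{\zeta}(\sigma_0) < \frac{1 + \epsilon_2}{\sigma_0 - 1} = \frac{1 + \epsilon_2}{c_4}\log\gamma \tag{11}
\]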

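for $\gamma \ge c_6$. Using the estimates (9), (10), and (11) in Theorem
7-10 gives
\[
3\,\frac{1 + \epsilon_2}{c_4}\log\gamma + 4\cdot 8(1 + \epsilon_1)\log\gamma - \frac{4}{\sigma_0 - \beta} + 8(1 + \epsilon_1)\log\gamma > 0.
\]
This inequality can easily be simplified to
\[
\sigma_0 - \beta > \frac{c_7}{\log\gamma},
\]
where
\[
c_7 = \frac{4c_4}{3(1 + \epsilon_2) + 40(1 + \epsilon_1)c_4},
\]
and this gives
\[
\beta < 1 - \frac{c_7 - c_4}{\log\gamma}.
\]
It is clear that $c_7 > c_4$ if $\epsilon_2$ and $c_4$ are sufficiently small, and we can
then take $c_2 = c_7 - c_4$ and $c_1 = \max(c_5, c_6)$.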

Theorem 7-14. If $0 < c_8 < c_2$,
\[
|\log\zeta(s)| < \log^{2} t \quad\text{for } t \ge c_9 \text{ and } \sigma \ge 1 - \frac{c_8}{\log t}.
\]

Proof: We use Theorem 7-7, with $s_0 = 2 + t_0 i$, for some $t_0 \ge 8$ to
be determined. For $t_0$ sufficiently large, the circular region
\[
|s - s_0| \le 1 + \frac{\tfrac12(c_2 + c_8)}{\log t_0} \tag{12}
\]
lies entirely in the region described in the preceding theorem, in which
$\zeta$ has no zeros. Hence the function $\log\zeta(s)$ is regular in this disk, and
by Theorem 7-11 (b),
\[
\operatorname{Re}\log\zeta(s) = \log|\zeta(s)| < \log(c\log t) < \log(c\log(t_0 + 2)) < c_{10}\log\log t_0.
\]
Hence, by Theorem 7-7, we have that for $s$ in the region (12) with
$|s - s_0| \le 1 + c_8/\log t_0$,
\[
|\log\zeta(s)| \le |\log\zeta(s_0)| + \frac{2\cdot 2\,(c_{10}\log\log t_0 + |\log\zeta(s_0)|)}{\dfrac{c_2 - c_8}{2\log t_0}}
< c + c\log t_0\log\log t_0 < \log^{2} t_0,
\]
if $t_0$ is sufficiently large. This inequality holds on the radius extending
toward the left from $s_0$, for every large $t_0$, and hence throughout a
region $t \ge c_9$, $1 - c_8(\log t)^{-1} \le \sigma \le 2$. Finally, $|\zeta(s)|$ and $|1/\zeta(s)|$
are bounded in the half-plane $\sigma \ge 2$, and $|\log\zeta(s)|$ is consequently
smaller than $\log^{2} t$ for $t$ large and $\sigma \ge 2$.


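Theorem 7-15. There is a constant $\alpha > 0$ such that as $x \to \infty$,
\[
\sum_{p \le x}\log\frac{x}{p} = \int_{c}^{1}\frac{x^{s}}{s^{2}}\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr)
\]
for some $c$ with $0 < c < 1$.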


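Proof: Using Theorem 7-9, we have
\[
\begin{aligned}
\frac{1}{2\pi i}\int_{(2)}\frac{x^{s}}{s^{2}}\log\zeta(s)\,ds
&= \frac{1}{2\pi i}\int_{(2)}\frac{x^{s}}{s^{2}}\sum_{n=2}^{\infty}\frac{\Lambda(n)}{n^{s}\log n}\,ds
= \frac{1}{2\pi i}\sum_{n=2}^{\infty}\frac{\Lambda(n)}{\log n}\int_{(2)}\frac{(x/n)^{s}}{s^{2}}\,ds \\
&= \sum_{n \le x}\frac{\Lambda(n)}{\log n}\log\frac{x}{n}
= \sum_{\substack{m,p \\ p^{m} \le x}}\frac{1}{m}\log\frac{x}{p^{m}} \\
&= \sum_{p \le x}\log\frac{x}{p} + \sum_{\substack{m,p \\ p^{m} \le x \\ m \ge 2}}\frac{1}{m}\log\frac{x}{p^{m}}.
\end{aligned}
\]
As noted earlier, the number of terms in the last sum is
\[
\pi(x^{1/2}) + \pi(x^{1/3}) + \cdots \le x^{1/2} + x^{1/3} + \cdots + x^{1/u} < u x^{1/2},
\]
where $u$ is the smallest integer such that $x^{1/u} < 2$. Thus
\[
\pi(x^{1/2}) + \pi(x^{1/3}) + \cdots = O(x^{1/2}\log x),
\]
so that
\[
2\pi i\sum_{p \le x}\log\frac{x}{p} = \int_{(2)}\frac{x^{s}}{s^{2}}\log\zeta(s)\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr),
\]
since
\[
\sum_{\substack{m \ge 2 \\ p^{m} \le x}}\frac{1}{m}\log\frac{x}{p^{m}} \le \sum_{\substack{m \ge 2 \\ p^{m} \le x}}\log x = O(\sqrt{x}\log^{2} x) = O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr).
\]
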


We now cut the complex plane along the real axis, extending the
cut from $s = 1$ to the left, and examine the function $\log\zeta(s)$ in the
cut plane. If $\bar z$ is the complex conjugate of $z$, then $\zeta(\bar s) = \overline{\zeta(s)}$ and
$\log\bar z = \overline{\log z}$, so that $\log\zeta(\bar s) = \overline{\log\zeta(s)}$. Hence, by Theorem 7-13,
$\zeta(s) \ne 0$ for $|t| \ge c_9 \ge c_1$ and $\sigma \ge 1 - c_8(\log|t|)^{-1}$. Moreover,
since $\zeta(s)$ does not vanish on the line $\sigma = 1$, and since its zeros have no
finite limit point in any half-plane $\sigma \ge \sigma_0 > 0$ (since $\zeta(s)$ is regular
there), there is a constant $c_{11} > 0$ such that $\zeta(s) \ne 0$ in the rectangle
\[
1 - c_{11} \le \sigma \le 1, \quad |t| \le c_9.
\]
Finally, $\zeta(s) \ne 0$ for $1 \le \sigma \le 2$, and the only singularity of the
function in the half-plane $\sigma > 0$ is at $s = 1$. Consequently, for
arbitrary $u > c_9$, $\log\zeta(s)$ is a single-valued analytic function in the
region $Q$ shown in Fig. 7-3, bounded by the arcs $\Gamma_1, \Gamma_2, \ldots, \Gamma_6, \Gamma_7,
\bar\Gamma_6, \ldots, \bar\Gamma_2, \bar\Gamma_1$. Hence if we denote by $\Gamma$ the complete boundary of
this region (so that we might write symbolically $\Gamma = \Gamma_1 + \Gamma_2 + \cdots
+ \bar\Gamma_1$), we have by Cauchy's theorem that
\[
\int_{\Gamma}\frac{x^{s}}{s^{2}}\log\zeta(s)\,ds = 0.
\]



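It follows that, if the integrals are taken in the positive direction,
\[
\begin{aligned}
\int_{(2)}\frac{x^{s}}{s^{2}}\log\zeta(s)\,ds
&= \left(\int_{2-\infty i}^{2-ui} + \int_{2-ui}^{2+ui} + \int_{2+ui}^{2+\infty i}\right)\frac{x^{s}}{s^{2}}\log\zeta(s)\,ds \\
&= \left(\int_{2-\infty i}^{2-ui} - \int_{\Gamma_2+\cdots+\Gamma_6+\Gamma_7+\bar\Gamma_6+\cdots+\bar\Gamma_2} + \int_{2+ui}^{2+\infty i}\right)\frac{x^{s}}{s^{2}}\log\zeta(s)\,ds.
\end{aligned}
\]
We shall show that all the other integrals are small in comparison
with those along $\Gamma_6$ and $\bar\Gamma_6$, if $u$ is sufficiently large. For brevity, put
\[
\Psi(x, s) = \frac{x^{s}}{s^{2}}\log\zeta(s).
\]
By Theorem 7-14, we have that for $u \ge c_9$,
\[
\left|\int_{2+ui}^{2+\infty i}\Psi(x, s)\,ds\right| \le \int_{2+ui}^{2+\infty i}\frac{x^{2}}{|s|^{2}}\,|\log\zeta(s)|\,|ds| < x^{2}\int_{u}^{\infty}\frac{\log^{2} t}{t^{2}}\,dt,
\]
so that
\[
\lim_{u\to\infty}\int_{2+ui}^{2+\infty i}\Psi(x, s)\,ds = 0.
\]
The same estimate applies to the integral from $2 - \infty i$ to $2 - ui$.
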

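Since the length of $\Gamma_2$ is less than 2, and the integrand is again
smaller than $x^{2}u^{-2}\log^{2} u$ in absolute value for $u$ large, we have
\[
\lim_{u\to\infty}\int_{\Gamma_2}\Psi(x, s)\,ds = 0,
\]
and similar considerations give
\[
\lim_{u\to\infty}\int_{\bar\Gamma_2}\Psi(x, s)\,ds = 0.
\]
Along $\Gamma_3$ we have $s = 1 - c_8(\log t)^{-1} + ti$, so that
\[
\left|\int_{\Gamma_3}\Psi(x, s)\,ds\right| < \int_{c_9}^{u} x^{1 - c_8/\log t}\,\frac{\log^{2} t}{t^{2}}\left(1 + \frac{c_8}{t\log^{2} t}\right)dt.
\]
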

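Now suppose that $x$, and then $u$, are chosen large enough that
\[
c_9 < e^{\sqrt{\log x}} < u.
\]
Then
\[
\begin{aligned}
\int_{\Gamma_3}\Psi(x, s)\,ds
&= O\left(x\int_{c_9}^{e^{\sqrt{\log x}}} x^{-c_8/\log t}\,\frac{\log^{2} t}{t^{2}}\,dt\right) + O\left(x\int_{e^{\sqrt{\log x}}}^{u}\frac{\log^{2} t}{t^{2}}\,dt\right) \\
&= O\bigl(x e^{-c_8\sqrt{\log x}}\bigr) + O\bigl(x e^{-\sqrt{\log x}}\log x\bigr) \\
&= O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr),
\end{aligned}
\]
where $\alpha = \tfrac12\min(c_8, 1)$.
By symmetry,
\[
\int_{\bar\Gamma_3}\Psi(x, s)\,ds = O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr).
\]
The paths $\Gamma_4$, $\Gamma_5$, $\bar\Gamma_4$, and $\bar\Gamma_5$ are of fixed lengths, and on them
\[
\Psi(x, s) = O(x^{1 - c_{11}}) = o\bigl(x e^{-\alpha\sqrt{\log x}}\bigr),
\]
so that the same estimate holds for the integrals themselves.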



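$\Gamma_7$ is described by the relations $s = 1 + \delta e^{\theta i}$, $|\theta| \le \pi$, where $\delta > 0$.
Since $(s - 1)\zeta(s) \to 1$ as $s \to 1$, we have
\[
\operatorname{Re}\log\zeta(s) = \log|\zeta(s)| \sim -\log|s - 1| = -\log\delta,
\]
\[
\operatorname{Im}\log\zeta(s) = \arg\zeta(s) = O(1)
\]
as $\delta \to 0^{+}$. Hence
\[
\int_{\Gamma_7}\Psi(x, s)\,ds = O\left(\frac{x^{1+\delta}}{(1 - \delta)^{2}}\,\delta\log\frac{1}{\delta}\right),
\]
which tends to zero with $\delta$.
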

Combining all these results, we can take the limit as $u \to \infty$ and
$\delta \to 0^{+}$ and obtain
\[
2\pi i\sum_{p \le x}\log\frac{x}{p} = \int_{1}^{1-c_{11}}\Psi(x, s)\,ds + \int_{1-c_{11}}^{1}\Psi(x, s)\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr),
\]
where the first integral is along the upper edge, and the second along
the lower edge, of the cut. We know that $(s - 1)\zeta(s) = h(s)$ is
regular in the region $\sigma > 0$, and that it has no zeros in the region
$\sigma \ge 1 - c_{11}$, $|t| \le c_9$. Hence the function
\[
\log\bigl((s - 1)\zeta(s)\bigr) = \log(s - 1) + \log\zeta(s)
\]
is single-valued in this region; since $\log(s - 1)$ has, on the upper
and lower edges of the cut, values which differ by $2\pi i$, the same is
true of $\log\zeta(s)$, if the difference is taken in reverse order. Hence
if $s^{+}$ indicates the upper edge of the cut, and $s^{-}$ the lower edge, then
\[
\int_{1}^{1-c_{11}}\Psi(x, s^{+})\,ds^{+} + \int_{1-c_{11}}^{1}\Psi(x, s^{-})\,ds^{-}
= 2\pi i\int_{1-c_{11}}^{1}\frac{x^{s}}{s^{2}}\,ds,
\]
and
\[
\sum_{p \le x}\log\frac{x}{p} = \int_{1-c_{11}}^{1}\frac{x^{s}}{s^{2}}\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr). \tag{13}
\]
The theorem is proved.



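Theorem 7-16. As $x \to \infty$,
\[
\pi(x) = \int_{2}^{x}\frac{du}{\log u} + O\bigl(x e^{-\frac12\alpha\sqrt{\log x}}\bigr).
\]
Proof: Replace $1 - c_{11}$ by $c$ in (13), and put
\[
\delta = \delta(x) = e^{-\frac12\alpha\sqrt{\log x}}.
\]
Then since $\log(1 + \delta) \sim \delta$ as $x \to \infty$, we have
\[
\begin{aligned}
\sum_{p \le x(1+\delta)}\log\frac{x(1+\delta)}{p} - \sum_{p \le x}\log\frac{x}{p}
&= \sum_{p \le x}\log(1 + \delta) + \sum_{x < p \le x(1+\delta)}\log\frac{x(1+\delta)}{p} \\
&= \log(1 + \delta)\,\pi(x) + O\bigl(\log(1 + \delta)\cdot\delta x\bigr) \\
&= \int_{c}^{1}\frac{\bigl((1 + \delta)^{s} - 1\bigr)x^{s}}{s^{2}}\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr),
\end{aligned}
\]
so that
\[
\pi(x) = \frac{1}{\log(1 + \delta)}\int_{c}^{1}\frac{\bigl((1 + \delta)^{s} - 1\bigr)x^{s}}{s^{2}}\,ds + O(\delta x) + O\left(\frac{x e^{-\alpha\sqrt{\log x}}}{\delta}\right).
\]
Now
\[
(1 + \delta)^{s} - 1 = s\delta + \frac{s(s - 1)}{2!}(1 + \vartheta\delta)^{s-2}\delta^{2},
\]
where $0 < \vartheta < 1$, so that for $0 \le s \le 1$,
\[
|(1 + \delta)^{s} - 1 - s\delta| \le \frac{s(1 - s)}{2}\,\delta^{2} < s\delta^{2}.
\]
Thus, making the change of variable $x^{s} = u$, we obtain
\[
\begin{aligned}
\frac{1}{\log(1 + \delta)}\int_{c}^{1}\frac{(1 + \delta)^{s} - 1}{s^{2}}\,x^{s}\,ds
&= \left(\frac{1}{\delta} + O(1)\right)\left(\delta\int_{c}^{1}\frac{x^{s}}{s}\,ds + O\left(\delta^{2}\int_{c}^{1} x^{s}\,ds\right)\right) \\
&= \bigl(1 + O(\delta)\bigr)\left(\int_{x^{c}}^{x}\frac{du}{\log u} + O\left(\delta\int_{c}^{1} x^{s}\,ds\right)\right) \\
&= \int_{2}^{x}\frac{du}{\log u} + O(\delta x).
\end{aligned}
\]
Finally,
\[
\pi(x) = \int_{2}^{x}\frac{du}{\log u} + O(\delta x) + O\left(\frac{x e^{-\alpha\sqrt{\log x}}}{\log(1 + \delta)}\right)
= \int_{2}^{x}\frac{du}{\log u} + O\bigl(x e^{-\frac12\alpha\sqrt{\log x}}\bigr).
\]
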
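The Prime Number Theorem is a very weak consequence of Theorem
7-16, since
\[
\int_{2}^{x}\frac{du}{\log u} = \left[\frac{u}{\log u}\right]_{2}^{x} + \int_{2}^{x}\frac{du}{\log^{2} u} \sim \frac{x}{\log x}
\]
and
\[
x e^{-c\sqrt{\log x}} = o\left(\frac{x}{\log x}\right)
\]
for every $c > 0$. In fact, we see by repeated integration by parts
that the relation
\[
\pi(x) = \frac{x}{\log x} + \frac{1!\,x}{\log^{2} x} + \frac{2!\,x}{\log^{3} x} + \cdots + \frac{(m-1)!\,x}{\log^{m} x} + O\left(\frac{x}{\log^{m+1} x}\right)
\]
holds for every positive integer $m$.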
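
To see how much sharper the logarithmic integral is than $x/\log x$ as an approximation to $\pi(x)$, one can compare the three quantities at a moderate value of $x$. The sketch below is only an illustration: the helper names `prime_count` and `li_from_2`, the sieve, and the use of Simpson's rule after the substitution $u = e^{v}$ are choices made here, not part of the text.

```python
import math


def prime_count(n):
    """Number of primes <= n (n >= 2), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(sieve)


def li_from_2(x, steps=20_000):
    """Simpson's rule for the integral of du/log u from 2 to x.
    After u = e**v the integrand becomes e**v / v on [log 2, log x]."""
    a, b = math.log(2.0), math.log(x)
    h = (b - a) / steps          # steps must be even for Simpson's rule
    g = lambda v: math.exp(v) / v
    total = g(a) + g(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3


if __name__ == "__main__":
    x = 10 ** 6
    print(prime_count(x), li_from_2(x), x / math.log(x))
```

At $x = 10^{6}$ the integral differs from $\pi(x)$ by a little over a hundred, while $x/\log x$ is off by several thousand.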

The coefficient a occurring in the remainder term in Theorem 7-16 
can easily be bounded explicitly; it can be shown for example that 
« = Is an allowable value, by choosing C 2 = Tinnr* ^4 = 
cg = i 0^0 2 »“ ■^1 <2 = T^- However, no result of this type is as 
good as the known result that 


\[
\pi(x) = \int_{2}^{x}\frac{du}{\log u} + O\bigl(x e^{-c\sqrt{\log x\,\log\log x}}\bigr).
\]


In a variant of the proof given here, the factor $\log\zeta(s)$ in the
integrand is replaced by $\zeta'(s)/\zeta(s)$. The logarithmic singularity at
$s = 1$ is then replaced by a simple pole, which makes the analysis
somewhat less complicated. On the other hand this gives an
estimate not of
\[
\pi(x) = \sum_{p \le x} 1,
\]
but of
\[
\psi(x) = \sum_{p \le x}\log p\left[\frac{\log x}{\log p}\right],
\]
and an additional step is needed to obtain the final result.



7-4 Extension to primes in an arithmetic progression. For
relatively prime integers $k$ and $l$, let $\pi(x; k, l)$ be the number of primes
$p \equiv l \pmod{k}$ which do not exceed $x$. For given $k$, there are $\varphi(k) = h$
choices of $l$ which are distinct modulo $k$, so that if the primes are more
or less evenly dispersed among the various progressions, it is to be
expected that
\[
\pi(x; k, l) \sim \frac{1}{h}\,\frac{x}{\log x}.
\]
It is the object of this section to show that this is the case, and in fact
to obtain an estimate for $\pi(x; k, l)$ similar to that given in Theorem
7-16 for $\pi(x) = \pi(x; 1, 0)$. Several proofs will not be given in full
detail since they are similar to those of the preceding sections. As in
the preceding chapter, we isolate the primes lying in a given
arithmetic progression by use of characters and $L$-functions. The
$L$-functions are in turn simple combinations of Hurwitz $\zeta$-functions, as the
following theorem shows.


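Theorem 7-17. For $\sigma > 1$,
\[
L(s, \chi) = k^{-s}\sum_{a=1}^{k}\chi(a)\,\zeta\left(s, \frac{a}{k}\right). \tag{14}
\]
Proof: Since $\chi$ is periodic of period $k$,
\[
L(s, \chi) = \sum_{n=1}^{\infty}\frac{\chi(n)}{n^{s}}
= \sum_{a=1}^{k}\chi(a)\sum_{m=0}^{\infty}\frac{1}{(km + a)^{s}}
= k^{-s}\sum_{a=1}^{k}\chi(a)\,\zeta\left(s, \frac{a}{k}\right).
\]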
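
The rearrangement in this proof can be checked numerically. The sketch below does so for the nonprincipal character modulo 4 at $s = 2$, where $L(2, \chi)$ is Catalan's constant (approximately 0.9159655942). It assumes the Hurwitz function is $\zeta(s, w) = \sum_{m \ge 0}(m + w)^{-s}$; the function names and the truncation parameter `M` are illustrative choices.

```python
def chi4(n):
    """The nonprincipal character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def hurwitz_partial(s, w, M):
    """First M terms of the Hurwitz series: sum of (m + w)**(-s), m = 0..M-1."""
    return sum((m + w) ** (-s) for m in range(M))


def L_direct(s, k, chi, M):
    """Partial sum of chi(n)/n**s over the first k*M integers."""
    return sum(chi(n) / n ** s for n in range(1, k * M + 1))


def L_via_hurwitz(s, k, chi, M):
    """The same partial sum, regrouped by residue class as in (14)."""
    return k ** (-s) * sum(chi(a) * hurwitz_partial(s, a / k, M)
                           for a in range(1, k + 1))


if __name__ == "__main__":
    print(L_direct(2, 4, chi4, 5000), L_via_hurwitz(2, 4, chi4, 5000))
```

The two partial sums contain exactly the same terms in a different order, so they agree up to rounding.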


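The domain of validity of (14) can be extended somewhat. If we
put
\[
E(\chi) =
\begin{cases}
1 & \text{if } \chi = \chi_0,\\
0 & \text{if } \chi \ne \chi_0,
\end{cases}
\]
then the first equation of Theorem 6-6 becomes
\[
\sum_{a}\chi(a) = E(\chi)h.
\]
Hence, for $\sigma > 1$,
\[
L(s, \chi) - \frac{E(\chi)h}{k(s - 1)} = k^{-s}\sum_{a=1}^{k}\chi(a)\left(\zeta\left(s, \frac{a}{k}\right) - \frac{1}{s - 1}\right) + \frac{E(\chi)h}{s - 1}\left(\frac{1}{k^{s}} - \frac{1}{k}\right).
\]
By Theorem 7-3, each summand on the right is regular for $\sigma > 0$, and
\[
\frac{1}{s - 1}\left(\frac{1}{k^{s}} - \frac{1}{k}\right) = \frac{k^{1-s} - 1}{k(s - 1)}
\]
is an integral function. By analytic continuation, we have

Theorem 7-18. The relation (14) holds for $\sigma > 0$, except at $s = 1$.
Moreover,
\[
\lim_{s\to 1}(s - 1)L(s, \chi) = \frac{hE(\chi)}{k}. \tag{15}
\]
Hence $L(s, \chi)$ is regular for $\sigma > 0$, except that $L(s, \chi_0)$ has a simple
pole at $s = 1$.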

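For $\sigma \ge 2$ and $t \ge 8$,
\[
|L(s, \chi)| \le \sum_{n=1}^{\infty}\frac{1}{n^{2}} < 2 <
\begin{cases}
t,\\
\log t,
\end{cases}
\]
while for $\sigma > 0$ and $t > 0$,
\[
|L(s, \chi)| \le k^{-\sigma}\sum_{a=1}^{k}\left|\zeta\left(s, \frac{a}{k}\right)\right|,
\]
so that Theorem 7-4 yields

Theorem 7-19. (a) For $\sigma \ge \tfrac12$ and $t \ge c_{12}(k)$, we have $|L(s, \chi)| < t$.

(b) For $t \ge 8$ and $\sigma \ge 1 - (\log t)^{-1}$, we have $|L(s, \chi)| < c_{13}(k)\log t$.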

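The proof of the nonvanishing of $\zeta(s)$ on $\sigma = 1$ can be generalized
in a simple way.

Theorem 7-20. $L(s, \chi)$ does not vanish on the line $\sigma = 1$.

Proof: For $\sigma > 1$,
\[
L(s, \chi) = \prod_{p}\bigl(1 - \chi(p)p^{-s}\bigr)^{-1},
\]
so that we can choose
\[
\log L(s, \chi) = \sum_{m,p}\frac{\chi(p^{m})}{m p^{ms}}.
\]
Hence
\[
\begin{aligned}
\log|L^{3}(\sigma, \chi_0)\,L^{4}(\sigma + ti, \chi)\,L(\sigma + 2ti, \chi^{2})|
&= 3\log|L(\sigma, \chi_0)| + 4\log|L(\sigma + ti, \chi)| + \log|L(\sigma + 2ti, \chi^{2})| \\
&= 3\log L(\sigma, \chi_0) + 4\operatorname{Re}\log L(\sigma + ti, \chi) + \operatorname{Re}\log L(\sigma + 2ti, \chi^{2}) \\
&= \operatorname{Re}\sum_{m,p}\left(\frac{3\chi_0(p^{m})}{m p^{m\sigma}} + \frac{4\chi(p^{m})}{m p^{m(\sigma + ti)}} + \frac{\chi^{2}(p^{m})}{m p^{m(\sigma + 2ti)}}\right) \\
&= \sum_{\substack{m,p \\ p \nmid k}}\frac{3 + 4\cos\bigl(\vartheta(p^{m}) - t\log p^{m}\bigr) + \cos 2\bigl(\vartheta(p^{m}) - t\log p^{m}\bigr)}{m p^{m\sigma}} \\
&\ge 0,
\end{aligned}
\]
where $\chi(p^{m}) = e^{i\vartheta(p^{m})}$. Thus
\[
\bigl((\sigma - 1)L(\sigma, \chi_0)\bigr)^{3}\left|\frac{L(\sigma + ti, \chi)}{\sigma - 1}\right|^{4}|L(\sigma + 2ti, \chi^{2})| \ge \frac{1}{\sigma - 1},
\]
and the falsity of the theorem would contradict Theorem 7-18.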

By now the proof of the following analog of Theorem 7-10 should
be a simple exercise for the reader.

Theorem 7-21. For $\sigma > 1$,
\[
-3\frac{L'}{L}(\sigma, \chi_0) - 4\operatorname{Re}\frac{L'}{L}(\sigma + ti, \chi) - \operatorname{Re}\frac{L'}{L}(\sigma + 2ti, \chi^{2}) \ge 0.
\]
Theorem 7-13 becomes

Theorem 7-22. There is a $c_1(k) \ge 8$ such that $L(s, \chi) \ne 0$ for
$t \ge c_1(k)$ and $\sigma \ge 1 - c_2/\log t$.

The only difference in the proofs is that now the first inequality of
Theorem 7-8 is applied with $f(s) = L(s, \chi^{2})$ and $s_0 = \sigma + 2ti$, while
the second is applied with $f(s) = L(s, \chi)$ and $s_0 = \sigma + ti$. Also, the
constants now depend on $k$. After these trivial modifications, the
proofs are identical.

Similarly, replacing $\zeta(s)$ by $L(s, \chi)$ throughout, Theorem 7-14
becomes


Theorem 7-23. For $t \ge c_9(k) \ge 8$ and $\sigma \ge 1 - c_8(\log t)^{-1}$,
\[
|\log L(s, \chi)| < \log^{2} t.
\]
The constant $c_9(k)$ may be different from the $c_9$ of Theorem 7-14;
the subscript is retained to facilitate reference to Fig. 7-3. In the
same way, $c_{11}$ becomes $c_{11}(k)$.

Instead of proceeding directly to the analog of Theorem 7-15, it is 
convenient to break the argument into two steps. 

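Theorem 7-24. For $(k, l) = 1$, we have
\[
\sum_{\substack{p \le x \\ p \equiv l \ (\mathrm{mod}\ k)}}\log\frac{x}{p} = \frac{1}{2\pi i h}\sum_{\chi}\frac{1}{\chi(l)}\int_{(2)}\frac{x^{s}}{s^{2}}\log L(s, \chi)\,ds + O(\sqrt{x}\log^{2} x).
\]
Proof: Using Theorem 7-9 and the series expansion for $\log L(s, \chi)$,
we obtain
\[
\begin{aligned}
\frac{1}{2\pi i}\int_{(2)}\frac{x^{s}}{s^{2}}\log L(s, \chi)\,ds
&= \frac{1}{2\pi i}\sum_{m,p}\frac{\chi(p^{m})}{m}\int_{(2)}\frac{(x/p^{m})^{s}}{s^{2}}\,ds \\
&= \sum_{\substack{m,p \\ p^{m} \le x}}\frac{\chi(p^{m})\log(x/p^{m})}{m} \\
&= \sum_{p \le x}\chi(p)\log\frac{x}{p} + \sum_{\substack{m,p \\ p^{m} \le x \\ m \ge 2}}\frac{\chi(p^{m})\log(x/p^{m})}{m} \\
&= \sum_{p \le x}\chi(p)\log\frac{x}{p} + O(\sqrt{x}\log^{2} x).
\end{aligned}
\]
Multiplying by $1/\chi(l)$ and summing over all characters modulo $k$, we
deduce with the help of Theorem 6-7 that
\[
\sum_{\chi}\frac{1}{\chi(l)}\sum_{p \le x}\chi(p)\log\frac{x}{p} = h\sum_{\substack{p \le x \\ p \equiv l \ (\mathrm{mod}\ k)}}\log\frac{x}{p}
= \frac{1}{2\pi i}\sum_{\chi}\frac{1}{\chi(l)}\int_{(2)}\frac{x^{s}}{s^{2}}\log L(s, \chi)\,ds + O(\sqrt{x}\log^{2} x),
\]
which is the theorem. (Here and throughout the remainder of this
section, the implied constant in the $O$-symbol may depend on $k$.)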
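
The step that isolates a single residue class uses only the orthogonality of the characters. A small sketch for $k = 5$ makes this concrete. It is an illustration under stated assumptions: the four characters modulo 5 are built from the primitive root 2, and the names `characters_mod5` and `class_selector` are introduced here.

```python
import cmath
import math


def characters_mod5():
    """The four Dirichlet characters modulo 5, using 2 as a primitive root."""
    dlog = {1: 0, 2: 1, 4: 2, 3: 3}   # residue -> exponent e with 2**e = residue (mod 5)
    chars = []
    for j in range(4):
        def chi(n, j=j):
            r = n % 5
            if r == 0:
                return 0j
            return cmath.exp(2j * math.pi * j * dlog[r] / 4)
        chars.append(chi)
    return chars


def class_selector(p, l, chars):
    """Sum over chi of chi(p)/chi(l); requires l prime to the modulus."""
    return sum(chi(p) / chi(l) for chi in chars)


if __name__ == "__main__":
    chars = characters_mod5()
    for p in (7, 11, 13, 31):
        print(p, class_selector(p, 1, chars))
```

The sum equals $h = 4$ when $p \equiv l \pmod 5$ and vanishes otherwise, which is why multiplying by $1/\chi(l)$ and summing over $\chi$ picks out the primes in one progression.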



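To estimate the integrals appearing in Theorem 7-24, we must
distinguish two cases. First consider the case $\chi = \chi_0$. Every
property of the integrand which was used in estimating
\[
\int_{(2)}\frac{x^{s}}{s^{2}}\log\zeta(s)\,ds
\]
carries over to the integrand of
\[
\int_{(2)}\frac{x^{s}}{s^{2}}\log L(s, \chi_0)\,ds.
\]
It follows that for suitable $c$ with $0 < c < 1$,
\[
\int_{(2)}\frac{x^{s}}{s^{2}}\log L(s, \chi_0)\,ds = 2\pi i\int_{c}^{1}\frac{x^{s}}{s^{2}}\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr).
\]
On the other hand, if $\chi \ne \chi_0$, then $L(s, \chi)$ has no pole at $s = 1$, but
the other properties used earlier still obtain. Hence, if we do not
cut the plane, but consider the line segments $\Gamma_5$ and $\bar\Gamma_5$ in Fig. 7-3
as a single segment $\Gamma_8$, and omit $\Gamma_6$, $\Gamma_7$, and $\bar\Gamma_6$, then the function
\[
\Psi(s, \chi) = \frac{x^{s}}{s^{2}}\log L(s, \chi)
\]
is regular in the region bounded by $\Gamma_1, \Gamma_2, \Gamma_3, \Gamma_4, \Gamma_8, \bar\Gamma_4, \bar\Gamma_3, \bar\Gamma_2, \bar\Gamma_1$,
so that
\[
\int_{(2)}\Psi(s, \chi)\,ds = \left(\int_{2-\infty i}^{2-ui} - \int_{\Gamma_2+\Gamma_3+\Gamma_4+\Gamma_8+\bar\Gamma_4+\bar\Gamma_3+\bar\Gamma_2} + \int_{2+ui}^{2+\infty i}\right)\Psi(s, \chi)\,ds.
\]
Moreover, the integral along each of these new arcs either tends to
zero or is
\[
O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr).
\]
It follows that
\[
\sum_{\substack{p \le x \\ p \equiv l \ (\mathrm{mod}\ k)}}\log\frac{x}{p} = \frac{1}{h}\int_{c}^{1}\frac{x^{s}}{s^{2}}\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr), \tag{16}
\]
which is the analog of Theorem 7-15. In exactly the same way as
Theorem 7-16 was deduced from Theorem 7-15, equation (16) leads
to the desired result:

Theorem 7-25. If $k$ is a fixed integer and $(k, l) = 1$, then, as
$x \to \infty$,
\[
\pi(x; k, l) = \frac{1}{\varphi(k)}\int_{2}^{x}\frac{du}{\log u} + O\bigl(x e^{-\frac12\alpha\sqrt{\log x}}\bigr).
\]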


As consequences of Theorem 7-25, we have that
\[
\pi(x; k, l) \sim \frac{1}{\varphi(k)}\,\frac{x}{\log x}
\]
and that, if $(k, l_1) = (k, l_2) = 1$, then
\[
\lim_{x\to\infty}\frac{\pi(x; k, l_1)}{\pi(x; k, l_2)} = 1,
\]
so that asymptotically there are equally many primes in the
progressions $kt + l_1$ and $kt + l_2$.
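
The equidistribution is already visible for small moduli at modest heights. The following sketch (the function name `primes_by_class` is an assumption made here) tallies the primes up to $10^{6}$ by residue class modulo 4.

```python
def primes_by_class(n, k):
    """Count the primes <= n in each residue class modulo k."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    counts = {}
    for p in range(2, n + 1):
        if sieve[p]:
            counts[p % k] = counts.get(p % k, 0) + 1
    return counts


if __name__ == "__main__":
    counts = primes_by_class(10 ** 6, 4)
    print(counts, counts[1] / counts[3])
```

Apart from the single prime 2, the primes split between the classes 1 and 3 modulo 4 with a ratio close to 1.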

A serious drawback of Theorem 7-25 is that the error term is not
uniform in $k$. This precludes applying this version of the theorem
to problems in which $k$ increases with $x$, and these unfortunately
are among the most important applications of this kind of theorem.
It is known that the error term in Theorem 7-25 is uniform in $k$ for
$k \le \log^{m} x$ for some $m > 0$, in other words, that the relation
displayed in the theorem can be used if $k$ increases sufficiently slowly
with $x$. The proof of the more general theorem, while similar to that
given here, is more complicated. The chief difficulty is this: when
dealing with fixed $k$, it is enough to prove that $L(s, \chi) \ne 0$ for
$s = 1 + ti$ in order to deduce that for some $c_{11}(k)$, $L(s, \chi) \ne 0$ for
$1 - c_{11} \le \sigma \le 1$, $|t| \le c_9(k)$. When $k$ increases, however, $c_{11}$ might
tend to zero quite rapidly as a function of $k$, in which case the integral
along $\Gamma_8$ would not be negligible. It is therefore necessary to
investigate further the zeros of the $L$-functions near the line $\sigma = 1$ for
small $|t|$.


7-5 The integers representable as a sum of two squares. As a
final illustration of the methods of this chapter, we shall obtain an
asymptotic estimate for $B(x)$, the number of integers not exceeding
$x$ which can be written as a sum of two squares. The integers counted
are exactly those in whose prime-power factorization the primes
$r \equiv 3 \pmod 4$ occur only to even powers.* The following heuristic
argument indicates that it is to be expected that $B(x)$ is of the order
of magnitude of $x/\sqrt{\log x}$, which is in agreement with the result to
be obtained.

* Cf. Volume I, Theorem 7-3.

Take $x$ very large. Since one out of every $p$ integers is divisible
by $p$, the number of integers up to $x$ not divisible by $p$ is about
$x(1 - 1/p)$. Hence the number not divisible by any $p \le \sqrt{x}$ is
roughly
\[
x\prod_{p \le \sqrt{x}}\left(1 - \frac{1}{p}\right),
\]
so that, by the Prime Number Theorem,
\[
\prod_{p \le \sqrt{x}}\left(1 - \frac{1}{p}\right) \approx \frac{1}{\log x},
\]
where the symbol $\approx$ means "is probably of the order of
magnitude of." To count the integers contributing to $B(x)$, we do not
want to eliminate all composite numbers, but only those divisible
by an odd power of any of the various primes $r \equiv 3 \pmod 4$. As in
the cross-classification principle,* we can omit all those divisible
by $r$, then reintroduce those divisible by $r^{2}$, then take out those
divisible by $r^{3}$, etc., giving
\[
x\left(1 - \frac{1}{r} + \frac{1}{r^{2}} - \frac{1}{r^{3}} + \cdots\right) = x\left(1 - \frac{1}{r}\right)\left(1 + \frac{1}{r^{2}}\right)\left(1 + \frac{1}{r^{4}}\right)\cdots
\]
as the number left after accounting for the one prime $r$. (The
product has only finitely many factors.) Hence
\[
B(x) \approx x\prod_{r \le \sqrt{x}}\left(1 - \frac{1}{r}\right)\prod_{r^{2} \le \sqrt{x}}\left(1 + \frac{1}{r^{2}}\right)\cdots,
\]
and since each product after the first converges as $x \to \infty$, we can
write simply
\[
B(x) \approx x\prod_{r \le \sqrt{x}}\left(1 - \frac{1}{r}\right).
\]

* Cf. Volume I, Theorem 6-4.

Now
\[
\log\prod_{p \le \sqrt{x}}\left(1 - \frac{1}{p}\right) = \sum_{p \le \sqrt{x}}\log\left(1 - \frac{1}{p}\right),
\]
and since, by the results of the preceding section, about half the
$p$'s are $r$'s, we have
\[
\log\prod_{r \le \sqrt{x}}\left(1 - \frac{1}{r}\right) \approx \frac12\log\frac{1}{\log x} = -\log\sqrt{\log x},
\]
so that
\[
B(x) \approx \frac{x}{\sqrt{\log x}}.
\]
Probably the most that can be said for this argument is that after
seeing it, the reader should not be very surprised to learn that, for
some $b > 0$,
\[
B(x) \sim \frac{bx}{\sqrt{\log x}}. \tag{17}
\]
Nevertheless, it is just this type of reasoning which underlies the
proof of (17) which will now be developed.

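If we put
\[
b_n =
\begin{cases}
1 & \text{if } n = x^{2} + y^{2} \text{ for some integers } x, y,\\
0 & \text{otherwise,}
\end{cases}
\]
then
\[
B(x) = \sum_{n \le x} b_n.
\]
For $\sigma > 1$ let
\[
f(s) = \sum_{n=1}^{\infty}\frac{b_n}{n^{s}};
\]
the series converges absolutely in this domain, and uniformly in any
closed bounded region to the right of the line $\sigma = 1$. Using $q$ and $r$ to
denote primes congruent to 1 and 3 (mod 4) respectively, we deduce
from the definition of $b_n$ and Theorem 6-3 that, for $\sigma > 1$,
\[
\begin{aligned}
f(s) &= \left(1 + \frac{1}{2^{s}} + \frac{1}{2^{2s}} + \cdots\right)\prod_{q}\left(1 + \frac{1}{q^{s}} + \frac{1}{q^{2s}} + \cdots\right)\prod_{r}\left(1 + \frac{1}{r^{2s}} + \frac{1}{r^{4s}} + \cdots\right) \\
&= (1 - 2^{-s})^{-1}\prod_{q}(1 - q^{-s})^{-1}\prod_{r}(1 - r^{-2s})^{-1}.
\end{aligned}
\]
As was pointed out in formula (5) of the preceding chapter,
\[
\zeta(s)L(s) = (1 - 2^{-s})^{-1}\prod_{q}(1 - q^{-s})^{-2}\prod_{r}(1 - r^{-2s})^{-1}, \tag{18}
\]
where $L(s) = L(s, \chi)$ is the $L$-function for the nonprincipal character
(mod 4),
\[
\chi(n) =
\begin{cases}
0 & \text{if } 2 \mid n,\\
(-1)^{\frac12(n-1)} & \text{if } 2 \nmid n.
\end{cases}
\]
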

(The relation (18) was proved earlier only for $s > 1$ and real; the
extension to the half-plane $\sigma > 1$ is immediate.) Hence, for $\sigma > 1$,
\[
f^{2}(s) = (1 - 2^{-s})^{-1}\prod_{r}(1 - r^{-2s})^{-1}\,\zeta(s)L(s). \tag{19}
\]
Since $L$ is regular for $\sigma > 0$ and
\[
L(1) = 1 - \frac13 + \frac15 - \cdots = \frac{\pi}{4},
\]
the function $\zeta L$ has a simple pole at $s = 1$, with residue $\pi/4$, but is
otherwise regular for $\sigma > 0$. Moreover, neither $\zeta(s)$ nor $L(s)$ is
zero for $s$ in the region $Q$ of Fig. 7-3, for suitable positive $c_{11}$ and $c_9$.
Since the functions
\[
(1 - 2^{-s})^{-1} \quad\text{and}\quad \prod_{r}(1 - r^{-2s})^{-1} \tag{20}
\]
are regular and different from zero for $\sigma > \tfrac12$, and bounded in
absolute value for $\sigma \ge \sigma_0 > \tfrac12$, we deduce the following properties of $f$
from known properties of $\zeta$ and $L$.

Theorem 7-26. (a) $f^{2}(s)$ is regular and different from zero in
the region $Q$ of Fig. 7-3, for suitable $c_{11}$ and $c_9$, and it has a simple
pole at $s = 1$, with residue
\[
b^{2} = \frac{\pi}{2}\prod_{r}(1 - r^{-2})^{-1}. \tag{21}
\]
Hence $f$ is also regular in $Q$, and $f^{2}(s)\cdot(s - 1)$ is regular in the
uncut region $Q'$ formed from $Q$ by omitting $\Gamma_6$, $\Gamma_7$, and $\bar\Gamma_6$, and
joining $\Gamma_5$ and $\bar\Gamma_5$.

(b) For $|t| \ge 8$ and $s$ in $Q$, the inequality $|f(s)| < c_{14}\log|t|$ holds
(cf. Theorem 7-11 (b)).

From this follows the usual consequence.


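Theorem 7-27. For suitable $c < 1$,
\[
\sum_{n \le x} b_n\log\frac{x}{n} = \frac{1}{\pi i}\int_{c}^{1}\frac{x^{s}}{s^{2}}f(s)\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr),
\]
where $f(s)$ is evaluated on the lower edge of the cut.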


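Proof: The proof follows the lines of that of Theorem 7-15 as
regards changing the path of integration in the relation
\[
\frac{1}{2\pi i}\int_{(2)}\frac{x^{s}}{s^{2}}f(s)\,ds = \sum_{n \le x} b_n\log\frac{x}{n}
\]
and estimating the new integrals along those paths which are
bounded away from $s = 1$; the only change is that the estimate
$|f(s)| < c_{14}\log|t|$ is used rather than $|\log\zeta(s)| < \log^{2}|t|$. Omitting
the tiresome details, we arrive at the relation
\[
2\pi i\sum_{n \le x} b_n\log\frac{x}{n} = -\int_{\Gamma_6+\Gamma_7+\bar\Gamma_6}\frac{x^{s}}{s^{2}}f(s)\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr).
\]
In the neighborhood of $s = 1$, $f(s)$ has the expansion
\[
f(s) = \frac{b}{\sqrt{s - 1}} + O\bigl(\sqrt{|s - 1|}\bigr),
\]
with $b > 0$ as in (21). Here $\sqrt{s - 1} > 0$ for $s > 1$. Putting $s = 1 + \delta e^{\theta i}$,
we have
\[
\int_{\Gamma_7}\frac{x^{s}}{s^{2}}f(s)\,ds = O\bigl(\delta\cdot\delta^{-1/2}\bigr) = o(1)
\]
as $\delta \to 0$.

Since $f^{2}(s)(s - 1)$ is single-valued in $Q'$, the quantity $2\arg f(s) +
\arg(s - 1)$ is unchanged by traversing a path in $Q$ from $\bar\Gamma_6$ to $\Gamma_6$.
Since $\arg(s - 1)$ increases by $2\pi$, $\arg f(s)$ decreases by $\pi$, so that $f(s)$
has opposite signs on the two edges of the cut. Hence, with $\Gamma_6$ described
from $1 - c_{11}$ to 1 and $\bar\Gamma_6$ from 1 to $1 - c_{11}$,
\[
\int_{\Gamma_6}\frac{x^{s}}{s^{2}}f(s)\,ds + \int_{\bar\Gamma_6}\frac{x^{s}}{s^{2}}f(s)\,ds = 2\int_{\bar\Gamma_6}\frac{x^{s}}{s^{2}}f(s)\,ds,
\]
and
\[
\sum_{n \le x} b_n\log\frac{x}{n} = \frac{1}{\pi i}\int_{1-c_{11}}^{1}\frac{x^{s}}{s^{2}}f(s)\,ds + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr),
\]
the integral being taken along the lower edge of the cut.
The proof is complete.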


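Theorem 7-28. As $x \to \infty$,
\[
B(x) = \frac{Bx}{\sqrt{\log x}} + O\left(\frac{x}{(\log x)^{3/4}}\right),
\]
where
\[
B = \frac{1}{\sqrt2}\prod_{r}(1 - r^{-2})^{-1/2}.
\]
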

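Proof: On $\bar\Gamma_6$ (the lower edge of the cut) we have
\[
\frac{f(s)}{s^{2}} = \frac{bi}{\sqrt{1 - s}\,\bigl(1 - (1 - s)\bigr)^{2}} + O\bigl(\sqrt{1 - s}\bigr)
= \frac{bi}{\sqrt{1 - s}} + O\bigl(\sqrt{1 - s}\bigr)
\]
as $s \to 1^{-}$, so that
\[
\begin{aligned}
\sum_{n \le x} b_n\log\frac{x}{n}
&= \frac{b}{\pi}\int_{1-c_{11}}^{1}\frac{x^{s}}{\sqrt{1 - s}}\,ds + O\left(\int_{1-c_{11}}^{1} x^{s}\sqrt{1 - s}\,ds\right) + O\bigl(x e^{-\alpha\sqrt{\log x}}\bigr) \\
&= \frac{bx}{\pi}\int_{0}^{c_{11}} x^{-u}u^{-1/2}\,du + O\left(x\int_{0}^{c_{11}} x^{-u}u^{1/2}\,du\right) + O\left(\frac{x}{\log^{3/2} x}\right) \\
&= \frac{bx}{\pi\sqrt{\log x}}\int_{0}^{c_{11}\log x} e^{-v}v^{-1/2}\,dv + O\left(\frac{x}{\log^{3/2} x}\int_{0}^{\infty} e^{-v}v^{1/2}\,dv\right) + O\left(\frac{x}{\log^{3/2} x}\right) \\
&= \frac{bx}{\pi\sqrt{\log x}}\left(\sqrt{\pi} + O\left(\int_{c_{11}\log x}^{\infty} e^{-v}\,dv\right)\right) + O\left(\frac{x}{\log^{3/2} x}\right).
\end{aligned}
\]
Thus
\[
\sum_{n \le x} b_n\log\frac{x}{n} = \frac{Bx}{\sqrt{\log x}} + O\left(\frac{x}{\log^{3/2} x}\right),
\]
where
\[
B = \frac{b}{\sqrt{\pi}} = \frac{1}{\sqrt2}\prod_{r}(1 - r^{-2})^{-1/2}.
\]
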

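Now let $\delta = \delta(x)$ be positive. Then
\[
\begin{aligned}
\sum_{n \le x+\delta x} b_n\log\frac{x + \delta x}{n} - \sum_{n \le x} b_n\log\frac{x}{n}
&= \log(1 + \delta)\sum_{n \le x} b_n + \sum_{x < n \le x+\delta x} b_n\log\frac{x + \delta x}{n} \\
&= \log(1 + \delta)B(x) + O\bigl(\log(1 + \delta)\cdot\delta x\bigr),
\end{aligned}
\]
while
\[
\begin{aligned}
\frac{Bx(1 + \delta)}{\sqrt{\log(x + \delta x)}} - \frac{Bx}{\sqrt{\log x}}
&= \frac{Bx}{\sqrt{\log x}}\left(\frac{1 + \delta}{\sqrt{\log(x + \delta x)/\log x}} - 1\right) \\
&= \frac{Bx}{\sqrt{\log x}}\left(\frac{1 + \delta}{\sqrt{1 + \log(1 + \delta)/\log x}} - 1\right) \\
&= \frac{Bx}{\sqrt{\log x}}\left(\frac{1 + \delta}{1 + O(\delta/\log x)} - 1\right) \\
&= \frac{Bx}{\sqrt{\log x}}\left(\frac{\delta + O(\delta/\log x)}{1 + O(\delta/\log x)}\right) \\
&= \frac{Bx}{\sqrt{\log x}}\bigl(\delta + O(\delta/\log x)\bigr).
\end{aligned}
\]
Hence, since $\log(1 + \delta) = \delta + O(\delta^{2})$ as $\delta \to 0$,
\[
\begin{aligned}
B(x) &= \frac{Bx}{\sqrt{\log x}}\cdot\frac{\delta + O(\delta/\log x)}{\log(1 + \delta)} + O(\delta x) + O\left(\frac{x}{\delta\log^{3/2} x}\right) \\
&= \frac{Bx}{\sqrt{\log x}}\bigl(1 + O(\delta)\bigr) + O(\delta x) + O\left(\frac{x}{\delta\log^{3/2} x}\right),
\end{aligned}
\]
provided that $1/\log x = O(\delta)$.
Choosing $\delta(x) = \log^{-3/4} x$, we obtain
\[
B(x) = \frac{Bx}{\sqrt{\log x}} + O\left(\frac{x}{\log^{3/4} x}\right),
\]
and the proof is complete.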
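
The slow convergence implied by the error term can be observed directly. The sketch below is an illustration only: the names `count_sums_of_two_squares` and `landau_B`, and the truncation of the product at 2000, are assumptions made here. It counts the positive integers up to $x$ that are sums of two squares and compares the count with $Bx/\sqrt{\log x}$.

```python
import math


def count_sums_of_two_squares(n):
    """Number of integers m with 1 <= m <= n of the form a*a + b*b."""
    representable = bytearray(n + 1)
    a = 0
    while a * a <= n:
        b = a
        while a * a + b * b <= n:
            representable[a * a + b * b] = 1
            b += 1
        a += 1
    return sum(representable[1:])   # drop 0 = 0 + 0


def landau_B(prime_bound=2000):
    """Truncation of B = 2**(-1/2) * product over primes r = 3 (mod 4)
    of (1 - r**-2)**(-1/2), using the primes r <= prime_bound."""
    product = 1.0
    for r in range(3, prime_bound + 1, 4):
        # r is odd, so trial division by odd d up to sqrt(r) suffices.
        if all(r % d for d in range(3, int(r ** 0.5) + 1, 2)):
            product *= (1.0 - r ** -2) ** -0.5
    return product / math.sqrt(2.0)


if __name__ == "__main__":
    x = 10 ** 5
    B = landau_B()
    print(count_sums_of_two_squares(x), B * x / math.sqrt(math.log(x)))
```

At $x = 10^{5}$ the count still exceeds the main term by several percent, consistent with a relative error that decays only like a power of $1/\log x$.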








SUPPLEMENTARY READING 


Chapter 1 

Dickson, L. E., Introduction to the Theory of Numbers, Chicago: University 
of Chicago Press, 1929. 

Dickson, L. E., Modern Elementary Theory of Numbers, Chicago:
University of Chicago Press, 1939.

Ford, L. R., An Introduction to the Theory of Automorphic Functions,
London: George Bell & Sons, Ltd., 1915. Reprinted, Chelsea Publishing
Company, New York, 1951.

Jones, B. W., The Arithmetic Theory of Quadratic Forms, Carus
Mathematical Monograph #10, Buffalo, N.Y.: Mathematical Association of
America, 1950. Distributed by John Wiley & Sons, Inc., New York.

Klein, F., Vorlesungen über die Theorie der Elliptischen Modulfunktionen,
Leipzig: Teubner Verlagsgesellschaft, 1890-1892.

Chapter 2 

Hecke, E., Vorlesungen uber die Theorie der Algebraischen Zahlen, Leipzig: 
Akademische Verlagsgesellschaft m.b.H., 1923. Reprinted, Chelsea 
Publishing Company, New York, 1948. 

Landau, E., Vorlesungen uber Zahlentheorie, vol. 3, Leipzig: S. Hirzel 
Verlag, 1927. Reprinted, Chelsea Publishing Company, New York, 
1947. 

Pollard, H., The Theory of Algebraic Numbers, Carus Mathematical 
Monograph Buffalo, N.Y.: Mathematical Association of America, 
1950. Distributed by John Wiley & Sons, Inc., New York. 

Reid, L. W., Elements of the Theory of Algebraic Numbers, New York: 
The Macmillan Company, 1910. 

Weyl, H., Algebraic Theory of Numbers, Annals of Mathematics Studies,
#1, Princeton: Princeton University Press, 1940.

Chapter 3 

Landau, E., Vorlesungen uber Zahlentheorie, vol. 3.

Mordell, L. J., Three Lectures on Fermat's Last Theorem, New York:
Cambridge University Press, 1921.



Chapter 5 

Gelfond, A. O., The Approximation of Algebraic Numbers by Algebraic
Numbers and the Theory of Transcendental Numbers, American Mathematical
Society Translation, Providence: American Mathematical Society,
1952. Translated from Uspekhi Matematicheskikh Nauk (Moscow) 4,
no. 4 (32), 19-49 (1949).

Koksma, J. F., Diophantische Approximationen, Berlin: Springer-Verlag
OHG, 1936. (Ergebnisse der Mathematik, vol. 4, no. 4.) Reprinted,
Chelsea Publishing Company, New York, 1951.

Perron, O., Irrationalzahlen, 2nd edition, Berlin: Walter De Gruyter
& Co., 1929.

Siegel, C. L., Transcendental Numbers, Annals of Mathematics Studies, 
#16, Princeton: Princeton University Press, 1949. 

Chapter 6 

Hasse, H., Vorlesungen uber Zahlentheorie, Berlin: Springer-Verlag OHG, 
1950. 

Landau, E., Vorlesungen uber Zahlentheorie, vol. 1. 

Chapter 7 

Estermann, T., Introduction to Modern Prime Number Theory, Cambridge
Tracts, #41, New York: Cambridge University Press, 1952.

Ingham, A. E., The Distribution of Prime Numbers, Cambridge Tracts,
#30, New York: Cambridge University Press, 1932.

Landau, E., Vorlesungen uber Zahlentheorie, vol. 2.

Landau, E., Handbuch der Lehre von der Verteilung der Primzahlen, Leipzig:
Teubner Verlagsgesellschaft, 1909. Reprinted, Chelsea Publishing
Company, New York, 1953.

Landau, E., Einfuhrung in die Elementare und Analytische Theorie der
Algebraischen Zahlen und der Ideale, Leipzig: Teubner
Verlagsgesellschaft, 1918. Reprinted, Chelsea Publishing Company, New York, 1949.



LIST OF SYMBOLS 


r, unimodular group, 8 
(a, 6, c), quadratic form, 15 
(/), group of automorphs, 25 

R, rational field, 38 

R[x], polynomials over R, 38 
deg p, degree of a polynomial, 38 
algebraic number field, 42 
Z, rational integers, 48 
R[9], integral domain, 48 
N, norm, 48, 68 

I, divides, 53, 65 

S, trace, 73 
n, 75, 124 

equivalent ideals, 82 
Kp, cyclotomic field, 85 

T, prime in Kp, 85 

II, exactly divides. 111 

IIII, 124 

height of polynomial, 124 
Markov’s constant, 166 
!■(«), Riemann’s function, 201 
^{k), group of residues prime to k, 207 
K ifiik), 207 

χ(a), character, 210 
^{k), group of characters, 211 
L(s, χ), Dirichlet's function, 214 
ζ(s, w), Hurwitz’ function, 232 
Q, region of integration, 247 

π(x; k, l), number of primes p ≡ l (mod k) with p < x, 252 






INDEX 


A-number, 171 
associate, 53 
automorph, 18 

Barnes, E. S., 81 
basis, integral, 50 
of a field, 50 
of a group of units, 75 
of an ideal, 59 
of a pure cubic field, 105 
of Kp, 87 
of R[√d], 54 

Cantor, G., 166 

Catalan, E., 154, 160 

character, 210 

class number, 83 

completely multiplicative, 203 

congruence, modulo an ideal, 67 

conjugate algebraic numbers, 40 

Dedekind, R., 72, 105, 120 
Delaunay, B., 112, 120 
Dirichlet, P. L., 75, 201 
Dirichlet series, 201 
discriminant, of a field, 52 
of an ideal, 61 
of a quadratic form, 4 
of a set of algebraic numbers, 49 
of Kp, 86 
of R[√d], 55 
domain, Euclidean, 56 
integral, 48 

unique factorization, 56 
Dyson, F. J., 123, 160 

Eisenstein’s irreducibility criterion, 
46, 67 


equivalent ideals, 82 
equivalent points, 9 
Euler’s constant, 161 
extension, algebraic, 44 

Fermat’s conjecture, 93 
field, 41 

algebraic number, 42 
cyclotomic, 85 
pure cubic, 104 
field conjugate, 43 
fundamental region, 9 

of Γ, 9 
of rA(/).26 

Fundamental Theorem of Algebra, 
35 

Gauss, C. F., 63 
Gelfond, A. O., 188, 198, 200 
greatest common divisor, of ideals, 
65 

group, 6 

Hadamard, J., 229 
height, of an algebraic number, 124 
Hilbert-Gelfond-Schneider theorem, 
198 

Hille, E., 200 
Hurwitz, A., 63, 121 
Hurwitz ζ-function, 232 

ideal, 58 
prime, 66 
principal, 58 

index of a polynomial, 135 
Inkeri, K., 81 
integer, algebraic, 47 
rational, 47 




irrationality, of e, 162 
of π, 163 

Kummer, E., 97 

law of quadratic reciprocity, 92 
Lehmer, D. H. and E., 103, 120 
LeVeque, W. J., 172 
Liouville, J., 121, 160, 165, 200 
Liouville number, 165 
Liouville's theorem, 121 

Mahler, K., 155, 160, 171, 174, 
200 
matrix, 2 

of a quadratic form, 4 
measure of transcendence, 170 
for e, 186 
modular group, 8 
Mordell, L. J., 120, 155 

Nagell, T., 112, 120, 229 
Newman, M., 157, 160 
Niven, I., 163, 200 
norm, of an algebraic number, 48 
of an ideal, 68 
number, algebraic, 39 

Obláth, R., 160 

Pell's equation, 25, 55, 74, 154 
period of reduced forms, 31 
Pólya, G., 187, 200 
polynomial, monic, 38 
primary, 88 
prime, algebraic, 55 
regular, 97 

primitive element, 44 
product, of determinants, 35 
of ideals, 62 


proper representation, 19 

quadratic form, 1 
definite, 15 
equivalent, 2 
indefinite, 22 
integral, 18 
primitive, 21 
reduced, 5, 16, 23 

representative of a form, 16 
residue class (mod A), 67 
Riemann ζ-function, 201 
roots of unity, 75, 85 
Rosser, J. B., 103 
Roth, K. F., 123, 160 

S-number, 171 

Schneider, T., 123, 160, 187, 198, 
200 

Siegel, C. L., 123, 160, 198, 200 
Swinnerton-Dyer, H.P.F., 81 
Symmetric Function Theorem, 35 

T-number, 171 
Thue, A., 122, 160 
Thue-Siegel-Roth Theorem, 148 
trace, 73 

transcendence, of e, 186, 199 
of π, 186 

transcendental number, 165 

U-number, 171 
unit of 53 

Vandiver, H. S., 103, 120 
Varnavides, P., 81 

Wronskian, 128 
generalized, 129 








